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Tr.is  program  dealt  with  the  development  and  application  of  new  approaches  for  producing  and 
evaluatina  lumped  parameter  models  of  physical  processes.  Local  and  global  sensitivity 
analysis  procedures  were  studied  for  achieving  this  goal.  Specifically,  Lie  group  formalism 
was  developed  to  address  global  parameter  space  mapping  issues  of  temporal  kinetics  problems 
and  extended  to  more  com;  lex  reactive-diffusive  problems.  Furthermore,  Lie  group  theoretical 
techniques  were  also  used  to  gain  analytic  insight  into  the  solution  of  nonlinear  kinetic 
systems.  Using  local  gradient  methods,  the  lumnabilitv  (or  model  reduction)  of  hvdronen/oxy- 
gen  and  carbon  monoxide/hydroaen/oxvgen  kinetic  mechanisms  were  studied  in  various  physical 
environments.  It  was  found  that  the  presence  of  strong  scaling  and  self  similarity  in  the 
sensitivities  allowed  for  kinetic  model  simplification.  Such  scaling  and  similarity  was  fount 
associated  with  strong  thermal  coupling  in  the  systems.  Lastly,  a  general  analysis  method  for 
the  exact  lumping  of  chemical  kinetic  mechanisms  was  dev- loped  and  illustrated  by  simple 
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INTRODUCTION 

The  design  and  optimization  of  realistic  engineering  combustion  devices 
involves  the  construction  and  execution  of  complex  mathematical  models.  These 
models  will  typically  involve  combustion  kinetics  as  well  as  transport 
processes  with  the  physics  and  chemistry  described  by  many  parameters  which 
are  imprecisely  known.  In  addition,  existing  freedom  for  choosing  combustion 
chamber  design  will  introduce  other  potentially  controllable  parameters  into 
the  model.  Therefore,  a  central  problem  in  all  design  problems  concerns  an 
understanding  of  system  performance  with  respect  to  its  parameter  values. 
Except  for  the  simplest  models,  such  an  understanding  will  necessitate 
extensive  computer  calculat ions,  and  repeated  execution  for  each  new  set  of 
parameters  will  lead  to  prohibitive  expense.  The  goal  of  this  program  is  to 
provide  new  insights  as  to  how  to  simplify  detailed  submodels  which  cause  the 
overall  system  calculations  to  be  prohibitively  difficult,  and  to  exercise  the 
techniques  to  develop  simplified  chemical  kinetic  models  which  provide 
sufficient  detail  for  generating  accurate  modeling  results.  This  goal  is 
being  pursued  by  developing  and  applying  new  techniques  in  the  general  areas 
of  sensitivity  analysis  and  Lie  algebraic  theories.  Due  to  widespread 
applications  of  these  two  analysis  methods,  the  outcome  from  this  research 
program  has  important  implications  to  many  other  problems  arising  in 
combustion  phenomena  as  well  as  to  other  subjects  of  interest  to  the  Air  Force 
(e.g.,  system  control,  parameter  identification  and  distinguishabil ity , 
statistical  parameter  uncertainty,  model  scaling,  etc.). 

WORK  STATEMENT 

The  work  statement  for  this  program  is  as  follows: 

1.  Develop  global  sensitivity  analysis  techniques  using  Lie  algebra  for 
parameter  space  mapping  and  control  of  temporal  systems.  Special  attention 
will  be  given  to  using  the  techniques  for  performing  finite  excursions  through 
parameter  space.  As  the  tools  develop,  they  will  be  applied  to  the  lumping 
consideration  above,  as  well  as  to  design  and  control  problems  relevant  to 
combustion  systems. 

2.  Appropriate  advanced  development  of  item  1  are  planned  to  extend  the 
analysis  procedures  to  more  complex  combustion  chemistry  and  to  include  both 
spatial  and  temporal  calculation  comparisons  of  lumped  and  detailed  models. 

3.  Determine  criteria  for  exact  lumping  in  chemical  kinetics  and  investigate 
the  implications  of  these  lumpability  conditions  to  model  systems. 

4.  Model  systems  will  be  studied  to  establish  the  use  of  elementary 
sensitivity  coefficients,  Green’s  function  elements  and  derived  sensitivity 
coefficients  for  lumping  purposes.  Appropriate  numerical  procedures  will  be 
employed  including  eigenvector-eigenvalue  analysis  and  rank  reduction  of  the 
appropriate  sensitivity  matrices. 


5.  The  sensitivity  techniques  of  item  4  will  be  developed  with 
hydrogen/oxygen  and  carbon  monoxide/hydrogen/oxygen  combustion  systems  as  test 
cases  foi  systematic  model  reduction  and  lumping.  The  ensuing  lumped  models 
will  be  compared  with  those  in  the  literature  for  their  predictive 
capabilities. 

STATUS  OF  RESEARCH 

During  the  past  year  research  on  several  interrelated  activities  was 
pursued  in  the  general  area  of  combustion  kinetics  and  sensitivity  analysis 
particularly  as  related  to  model  lumping  and  reduction.  The  thrust  of  these 
developments  has  largely  been  fundamental  with  the  emphasis  on  creating  and 
ultimately  applying  new  methodology  for  treating  several  critical  problems 
arising  in  combustion  phenomena.  Particular  emphasis  was  given  to  modelling 
and  theoretical  concerns,  although  a  portion  of  the  research  is  directly 
related  to  the  interface  between  laboratory  measurements  and  modelling.  A 
summary  of  these  activities  is  given  below. 

I.  GLOBAL  SENSITIVITY  ANALYSIS:  A  NEW  PERSPECTIVE 

This  research  is  motivated  by  a  number  of  important,  but  heretofore 
difficult  problems,  in  combustion  modelling  including  system  lumping, 
statistical  parameter  uncertainty  and  system  control.  All  of  these  issues  as 
well  as  others  necessitate  obtaining  an  understanding  of  how  the  system 
behaves  over  broad  region  of  its  parameter  space  of  rate  constants,  transport 
coefficients,  etc.  Traditional  methods  for  achieving  this  information  rely  on 
repeated  calculations  at  sample  points  in  the  system  parameter  space.  Such  an 
approach  is  prohibitively  expensive  and  the  results  will  typically  provide 
little  insight  into  the  detailed  workings  of  the  system.  With  this 
information  as  background,  we  have  been  pursuing  an  entirely  new  approach 
based  on  Lie  group  techniques  for  mapping  broad  regions  of  system  parameter 
space.  The  initial  developments  appear  very  promising  as  summarized  below. 

A.  Uniform  Temporal  Reacting  Systems1 

One-parameter  groups  of  transformations  were  used  to  investigate  the 
effects  of  wide-ranging  changes  in  rate  constants  and  input/output  fluxes  upon 
homogeneous  chemical  reactions  involving  an  arbitrary  number  of  species  in 
reactions  of  zero,  first  and  second  order. 

Every  transformation  group  is  so  chosen  that  it  either  exactly  or 
approximately  converts  each  solution  of  a  set  of  rate  equations  into 
corresponding  solutions  of  a  one-parameter  family  of  altered  rate  equations. 
All  of  these  solutions  have  topologically  equivalent  equilibrium  points  and 
topologically  equivalent  phase  trajectories  in  the  space  of  concentration 
variables.  Compounding  the  transformations  yields  transformations  with  the 
same  properties. 

The  chemical  significance  of  the  transformations  was  illustrated  by 
applying  them  to  kinetic  systems  involving  two  reacting  species.  There  are 
then  twelve  separate  one-parameter  groups  of  transformations  available.  The 
generators  of  all  allowed  one-parameter  groups  are  obtained  for  systems  of  N 


species  using  an  algorithm  which  exactly  determines  their  action  on  the  rate 
constants,  and  either  exactly  determines  or  systematically  approximates  their 
action  on  the  concentrations.  The  generators  determine  invariant  functions 
that  establish  relations  between  the  initial  rate  constants  and  the  altered 
rate  constants  and  between  the  initial  concentration  variables  and  the  altered 
concentration  variables. 

Some  mapping  of  the  concentrations  simply  shift  their  values  and  may  be 
used  to  study  the  effects  of  changes  in  input/output  fluxes  and  rate  constants 
upon  concentrations.  Other  mappings  create  "lumped”  concentration  variables 
and  may  be  used  to  systematically  reduce  the  number  of  manifest  concentration 
variables  in  nonlinear  as  well  as  linear  kinetic  equations. 

A  number  of  mappings  of  nonlinear  kinetics  may  be  used  to  obtain 
approximate  linearizations  valid  in  regions  larger  than  those  obtained  by  the 
usual  power  series  expansions.  In  some  cases  the  linearization  is  global  and 
exact . 

B.  Reaction-Diffusion  Systems2 

The  methodology  developed  in  paragraph  A  above  has  an  immediate 
transferral  to  the  more  complex  and  interesting  class  of  reaction-diffusion 
problems.  Although  partial  differential  equations  will  in  general  admit  a 
much  broader  set  of  Lie  group  mappings,  their  full  determination  will  also  be 
more  difficult  to  establish.  However,  in  the  case  that  the  system  diffusion 
coefficients  are  not  concentration  dependent,  exactly  the  same  transformations 
developed  for  the  purely  temporal  reacting  systems  may  be  applied  beneficially 
to  the  case  of  reaction-diffusion.  For  example,  transformations  which 
rigorously,  or  at  least  regionally,  linearize  a  reaction  network  have  exactly 
the  same  effect  on  an  analogous  reaction-diffusion  system.  Application  of 
these  transformations  has  allowed  us  to  analytically  explore  the 
interrelationship  between  diffusive  transport  and  reaction  kinetics.  In 
addition,  traditional  local  methods  of  stability  analysis  may  now  be 
significantly  extended  to  cover  regions  of  the  system  state  and  parameter 
spaces.  We  plan  to  use  these  tools  for  the  treatment  and  analysis  of 
realistic  combustion  models. 

II.  Analytic  Insight  Into  the  Solution  of  Kinetic  Systems3’4 

This  research  has  a  close  connection  with  that  of  item  I  above  in  that  it 
is  also  based  on  the  use  of  Lie  group  theoretical  techniques.  The  goal  here 
is  more  limited  with  the  purpose  largely  being  the  provision  of  analytical 
insight  into  the  solution  of  nonlinear  combustion  kinetic  systems. 

Traditional  numerical  methods  for  this  purpose  can  typically  provide  varying 
degrees  of  accuracy  but  are  quite  inadequate  with  regard  to  their  resultant 
insight.  To  deal  with  this  problem  we  have  developed  a  new  analytic 
approximation  scheme  for  the  finite  Lie  transformation  yielding  the  solution 
to  a  set  of  nonlinear  kinetic  equations.  This  work  also  has  immediate 
applications  to  the  global  parameter  space  Lie  generators  involved  with  the 
study  in  item  I  above. 

In  this  work  a  new  method  to  factorize  certain  evolution  operators  into 


an  infinite  product  of  simple  evolution  operators  is  presented.  The  method 
uses  Lie  operator  algebra  and  the  evolution  operators  are  restricted  to 
exponential  form.  The  argument  of  these  forms  is  a  first  order  linear  partial 
differential  operator.  The  method  has  broad  applications  including  to  the 
areas  of  sensitivity  analysis,  the  solution  of  ordinary  differential  equations 
and  the  solution  of  Liouville’s  equation.  A  sequence  of  £ -approx imants  is 
generated  to  represent  the  Lie  operators.  Under  certain  conditions  the 
convergence  rate  of  the  £ -approximant  sequences  is  remarkably  rapid.  This 
work  presented  the  general  formulation  of  the  scheme  and  some  simple 
illustrative  examples. 

Additional  research  was  carried  out  to  establish  convergence  theorems 
associated  with  its  sequence  of  £ -approxmimants .  The  theorems  presented  give 
vhe  conditions  which  are  sufficient  for  convergence  of  the  sequences, 
nxthough  the  main  emphasis  was  on  convergence  properties  of  the 
one-dimensional  case,  the  generalization  to  multidimensional  cases  is  quite 
straightforward.  Further  development  and  numerical  illustrations  are 
underway. 

III.  Kinetic  System  Identifiability  and  Distinguishability6 > 6 

By  following  the  kinetics  of  a  reaction  through  the  use  of  certain 
classes  of  measurable  quantities  instead  of  the  concentrations  of  all  species 
neither  the  parameter  values  nor  the  reaction  scheme  are  necessarily  unique. 
Identifiability  deals  with  the  problem  of  determining  whether  an  experiment  is 
able  to  supply  the  desired  information  on  the  parameters  of  an  assumed  kinetic 
model,  whereas  indistinguishability  means  that  two  different  reaction  schemes 
generate  the  same  values  for  the  observed  quantities  in  any  possible 
experiment.  This  work  examined  these  issues  for  the  case  of  first-order 
reaction  systems  and  both  problems  are  solved  by  the  same  analytical  tools. 

The  method  involving  Laplace  transforms  is  conceptually  simple,  easy  to  apply, 
and  is  also  used  to  derive  simple  rules  to  test  distinguishability  of  reaction 
schemes.  Another  approach  based  on  similarity  transformations  is  used  to 
generate  all  the  first-order  reaction  schemes  that  are  indistinguishable  from 
c  given  one.  These  same  concepts  have  been  extended  to  nonlinear  systems  for 
the  case  of  global  identifiability. 

IV.  A  General  Analysis  on  Exact  Lumping  in  Chemical  Kinetics7 

A  general  analysis  of  exact  lumping  has  been  developed.  This  analysis 
can  be  employed  to  any  reaction  system  with  n  species  described  by  a  set  of 
first-order  ordinary  differential  equations  dy/dt  =  f(y),  where  y  is  an 
n-dimensional  vector;  f(y)  is  an  arbitrary  n-dimensional  function  vector. 

Here  we  only  consider  lumping  by  means  of  a  rectangular  constant  matrix  M 
(i.e.,  y  lower  dimension  than  y  with  y  =  My).  It  is  found  that  a  reaction 
system  is  exactly  lumpable  if  and  only  if  the  intersection  of  the  invariant  or 
the  null  subspaces  of  the  Jacobian  matrix  J(y)  of  f(y)  for  all  values  of  y  is 
nonempty.  If  the  intersection  is  less  than  n,  nontrivial  lumping  schemes  can 
be  obtained.  It  is  proved  that  the  Jacobian  matrix  can  be  represented  as  a 
linear  combination  of  certain  matrices  and  the  intersection  of  the  invariant 
or  null  subspace  of  the  constant  matrices  is  just  that  of  the  Jacobian  matrix. 
After  the  determination  of  the  intersections,  all  possible  lumping  matrices 


can  be  obtained.  The  kinetic  equations  of  the  lumped  system  can  be  described 
as  dy/dt  =  Mf(My),  where  M  is  any  'generalized  inverse  of  M  satisfying  W  =  1^,. 
Several  implications  of  these  lumpability  conditions  were  investigated  as  well 
as  illustrated  by  some  simple  examples. 

V.  Hydrogen-Air  Combustion  Revisited  Under  a  Variety  of  Conditions8 

Surely  the  hydrogen-air  combustion  system  has  received  the  closest 
scrutiny  both  theoretically  and  experimentally.  Yet,  there  is  still  a  lack  of 
complete  understanding  about  which  aspects  of  the  system  are  important 
particularly  under  a  variety  of  laboratory  conditions.  In  order  to  further 
understand  this  issue,  we  have  carried  out  modelling  and  sensitivity  analysis 
studies  under:  a)  purely  temporal  and  isothermal  conditions,  b)  purely 
temporal  and  adiabatic  conditions,  c)  steady  premixed  adiabatic  conditions. 
This  triad  of  studies  provides  an  interesting  hierarchy  allowing  us  to 
understand  the  role  of  diffusive  transport  as  well  as  thermal  coupling.  One 
motivation  for  this  work  arose  from  previous  studies  showing  extensive  scaling 
and  self  similarity  behavior  amongst  the  hydrogen-air  sensitivity 
coefficients.  This  behavior  has  significant  implications  for  model 
simplification  both  in  this  system  as  well  as  complex  combustion  problems.  It 
had  been  previously  speculated  that  the  temperature  was  providing  the  dominant 
coupling  to  produce  self  organization  amongst  all  the  system  species  and  rate 
parameters.  This  study  has  confirmed  this  conjecture  as  well  as  revealed  a 
number  of  other  underlying  subtleties  including  the  role  of  diffusion.  In 
addition,  it  was  shown  that  the  presence  of  strong  scaling  and  self  similarity 
in  the  premixed  flames  allowed  for  kinetic  model  simplification. 

VI.  Model  Reduction  and  Lumping  of  Carbon  Monoxide  Oxidation  Kinetics 

In  our  previous  studies,  normalized  sensitivity  coefficients, 

Sj  =  88.n  Ci/dftnaj,  have  been  studied  to  determine  the  relative  importance  of 
elementary  reactions  or  certain  groups  of  reactions  in  comprehensive 
mechanisms.  We  have  presently  extended  this  methodology  by  using  the 
principal  component  analysis  method  of  Vajda  and  Turanyi  [J.  Phys.  Chem. , 

March  1986J  in  order  to  systematically  reduce  the  size  of  the  original 
comprehensive  mechanism.  Briefly,  this  methodology  is  based  on  a  least 
squares  fit  approach  by  first  defining  the  response  function,  Q,  as 


q  m  C, (tj ,a)  -  C, (tj ,a*)  2 

Q  (a)  =  I  I  [ - F" - ] 

j=l  i=l  C,(tJ,S> 


where  a*  is  the  nominal  values  of  the  parameters,  then  by  introducing  the 
classical  Gauss-approximation  to  yield  Q(a)  =  Q(a)  =  (Aa)T  £T£(Aa)  where 
aj  =  Jlnotj  ,  and  finally  by  performing  an  eigenvalue-eigenvector  decomposition 
on  the  resulting  cross-product  matrix  STS.  Eigenvectors  corresponding  to 
small  eigenvalues  :  'dicate  unimportant-reactions ,  thereby  enabling  one  to 
optimally  reduce  the  mechanism. 


Along  these  lines  we  have  continued  our  previous  work  by  applying  this 
methodology  to  the  CO/Ra /O2  reaction  mechanism.  A  large  number  of  isothermal 


temporal  problems  were  numerically  run  to  generate  a  data  base  which  would  be 
representative  of  combustion  environments.  The  data  base  covered  a 
temperature  range  from  800  to  1800  K,  several  equivalence  ratios  from  lean  to 
rich  conditions,  and  several  pressures.  The  principal  component  analysis  was 
applied  to  this  data  base  to  determine  the  minimum  reaction  set  that  would 
reproduce  all  the  original  species  concentrations  within  2 %. 

The  results  showed  that  the  original  52  reaction  mechanism  could  be 
successfully  reduced  to  one  consisting  of  27  reactions  while  retaining  all  12 
species  in  the  model. 

Obviously,  this  reduction  is  still  not  practical  for  use  in  large 
multidimensional  codes.  The  necessary  further  reductions  are  proposed  to 
proceed  along  several  directions.  First,  the  constraint  of  retaining  all 
species  will  be  lifted.  Our  earlier  research  has  shown  that  in  addition  to 
the  major  reactants  and  products,  two  intermediate  species  are  necessary  in 
the  model.  Secondly,  we  have  also  found  that  in  more  complex  environments, 
such  as  adiabatic  premixed  flames,  the  underlying  chemical  processes  are  much 
more  coupled  (namely  through  the  heat  release  of  the  reaction)  and  hence,  such 
problems  are  anticipated  to  be  more  directly  lumpable. 
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ABSTRACT 


This  work  concerns  convergence  theorems  associated  with  a 
sequence  of  J  -  approx imants  for  exponential  evolution  operators 
with  Lie  operator  arguments.  A  companion  paper  presents  the 
formulation  of  the  £  -  approx imants .  The  theorems  presented  in 
this  paper  give  the  conditions  which  are  sufficient  for  conver¬ 
gence  of  the  sequences.  Although  the  main  emphasis  will  be  on 
convergence  properties  of  the  one-dimensional  case,  the  generali¬ 
zation  to  multidimensional  cases  is  quite  straightforward. 


INTRODUCTION 


I . 

In  a  companion  paper  we  developed  a  new  factorization 
scheme1  for  exponential  evolution  operators  with  Lie  operator 
arguments.  The  factorization  was  based  on  ordering  of 
contributions  to  the  evolution  operator  with  respect  to  devi¬ 
ations  from  a  steady-state  solution.  Hence,  in  the  Lie  operator 
of  the  form  f(x)'V  the  function  f(x)  must  vanish  around  the 
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origin  x=0.  The  factorization  scheme  results  in  an  infinite 
product  of  elementary  evolution  operators  and  the 
approximation  to  the  desired  overall  evolution  operators  is 
achieved  by  a  truncation  of  the  infinite  product  to  order  n. 

This  procedure  produces  a  sequence  of  I  -  approx imants 
to  the  desired  evolution  operator.  The  effect  of  the  Lie  trans¬ 
formation,  or  its  approximate  representation,  on  the  position 
vector  x  Is  fundamental  in  the  theory  since  many  of  the  basic 
operations  may  be  related  to  certain  properties  of  Lie  trans¬ 
formations.  A  simple  first  order  recursion  relation  may  be 
found  for  the  ?  -  approximants ,  however  they  are  rich  in 
singularities.  As  the  limit  of  the  sequence  of  the  J  -  approxi¬ 
mate  is  taken,  infinitely  many  branch  point  trajectories 
may  exist  in  the  complex  x-plane.  The  flexibility  inherent 
in  the  £  -  approximants . suggests  that  this  approach  may  rapidly 
converge  to  accurately  approximate  the  effect  of  the  evolution 
operator  on  x.  This  conjecture  was  confirmed  in  a  number 
of  applications1  although  certain  cases  exhibited  slow  or 
non-convergent  characteristics. 


Such  empirical  evidence  is  helpful,  but  a  mathematical 
proof  of  the  convergence  behavior  is  needed  in  order  to 
intelligently  use  the  method  in  realistic  applications.  It 
is  necessary  to  establish  not  only  the  existence  of  convergence 
but  also  determine  the  criteria  under  which  convergence  is 
expected.  The  purpose  of  this  paper  is  to  address  these  latter 
issues . 

In  order  to  mathematically  explore  the  convergence 
characteristics.  Section  II  will  investigate  the  singularities 
of  the  ?  -  approximants .  This  section  will  also  define  some 
useful  fundamental  concepts.  Section  III  will  present  the 
convergence  theorems  for  the  ?  -  approximant  sequences.  These 
latter  developments  will  be  carried  out  for  the  one-dimensional 
case  and  Section  IV  will  generalize  the  theorems  to  the  multi¬ 
dimensional  case.  Finally,  Section  V  presents  concluding 
remarks . 


Singularities  of  I  -  Approximants  and  Some  Fundamental 
Definitions  in  the  One  Dimensional  Case 


II . 


The  evolution  operator  of  concern  has  the  form  Q=exp|f 
where  f(x)  is  a  specified  function  defining  the  Lie  Operator. 

The  action  of  Q  on  x  is  approximated  by  a  sequence  of  X  -  approx- 
imants,  Qx  »  ?n  such  that 


fn+i  ~ 


In 


1  -  non+1 ( t)  ?R]l/n 


?i(x,t)  =  x  exp(fAt) 

(II. l) 


where 


Q  ■  exp(tf (x)  |x]  =  n  exp [°j ( )  xl|x] 

J  =  i 


and 


(II- 2) 


f (x)  =  xk 

K  =  1 


f  k  Xk 

(II. 3) 

The  coefficients  o5(t)  in  each  of  the  elementary  exponential 
operators  in  Eq.  (I I. 2)  are  global  functions  of  time.  The 
evaluation  of  these  coefficients  establishes  the  terms  of  the 
recursion  relation  in  Eq.  (II.l)  and  the  details  of  this 
operation  were  presented  in  an  earlier  paper1.  The  iteration 
in  Eq.  (II.l)  may  be  written  in  explicit  form  as 


la(x,t)  =  - r 

».<«•<>  -  .a-.,: 


(II. 4) 
(II- 5) 
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ln(x,t)  =  [  [...  [l-Oa  x]2-  03  x*]^ 

in-i  i  1 

On-i  *n-2jin  -  „n  ,n-ij-R 


-  ..  «•]* 


•  X  exp  (fxt) 


where 


Sn+itt)  =  non+1(t)  exp(nf j t ) 


(II. 6) 


(II. 7) 


The  structure  of  n*ay  be  identified  as  a  type  of  continued 
fraction . 

The  origin  in  the  complex  x-plane  is  not  a  singular  point 
for  all  the  fn's  as  long  as  t  remains  finite.  Since 
Oj(o)  -  0j(o)  =  o,  all  singularities  of  the  approximants  are 

gathered  at  infinity  at  the  initiation  of  the  evolution.  Each 
singularity  moves  along  a  trajectory  in  the  complex  x-plane 
as  time  evolves  and  may  or  may  not  reach  the  origin  when  t  tends 
to  infinity.  As  a  specific  example  we  will  now  examine  the 
second  approximant,  ?2(x,t).  This  approximant  has  a  rather 
simple  singularity,  a  pole,  whose  location  is  given  as  follows: 

Xp  *  ■» — — rr  ~n — ~rr  (II-8) 


where  we  have  made  use  of  the  formula 


oa(t)  *  yi  <i-*xp(-f it) ) 


(II. 9) 


Since  the  expansion  coefficients  fn  are  assumed  to  be  real, 
the  pole  in  Eq .  (II. 8)  is  evidently  located  on  the  real  axis 
of  the  complex  x-plane.  The  pole  starts  to  move  from  either 


i» ,  r  ,  \ 
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+  «*  (f 2>o)  or  -  ■  (f2<o)  to  a  limiting  point  as  time,  t,  tends  to 
infinity.  If  f2  =  o,  the  pole  remains  at  infinity.  In  general, 
two  different  cases  occur  as  time  evolves  assuming  that  f2«*o. 


1  ^  \  — 
t  =<*>  *p  \  »  ) 


LlI 


fj  >  o 
f  I  <  0 


(II. 10) 


It  is  apparent  from  Eq.  (I I. 10)  that  if  the  system  under  con¬ 
sideration  is  unstable,  >  o,  then  the  trajectory  of  the 
singular  point  ends  at  the  origin.  However,  if  the  system  is 
stable,  f !  <  o,  then  the  singular  point  stops  at  a  finite 
location  away  from  the  origin  on  the  real  axis.  Therefore, 
at  least  for  this  approximant,  there  is  a  "clean"  region  where 
a  singularity  can  never  appear  if  ),  <  o  and  the  origin  of  the 
complex  x-plane  is  an  interior  point  of  this  clean  region. 

If  the  system  under  consideration  is  unstable,  the  origin  may 
again  be  included  in  this  clean  region,  however,  in  this  case  it 
becomes  a  point  located  on  the  border  of  the  clean  region. 

In  order  to  gain  further  insight  into  the  ?n  -  approximants , 
we  shall  now  examine  the  next  approximant.  J3.  This  approximant 
has  four  branch  points,  two  of  which  are  located  at  infinity  and 
the  remaining  ones  are  given  below  (where  3a  >  o ,  otherwise 
branch  points  are  complex) . 


=  I02(t)  +  t0j(t)]"3 


X2  »=  1 02  (  t  )  -  to3(t)]^)-1 


(II. 11) 


These  singularities  are  algebraic  branch  points  with  two 
Riemann  sheets.  Depending  on  the  nature  of  the  system, 
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022  -  03  may  be  positive,  zero  or  negative.  If  it  differs  from 
zero,  then  the  origin  becomes  an  interior  point  of  the  clean 
region  for  this  approximant. 

There  is  a  remarkable  property  about  the  X  -  approx imants 
which  can  be  stated  as  follows.  If  has  a  singularity  which  is 
a  branch  point  (except  for  the  case  j.=  2),  then  every 
£k  -  approximant  (k>j)  will  have  the  same  singularity.  This 
means  that  when  j  tends  to  infinity  there  will  be  an  abundance  of 
branch  point  trajectories  in  the  complex  x-plane.  Any  given 
trajectory  may  or  may  not  be  in  the  clean  regions  in  the  complex 
x-plane.  As  we  shall  see,  the  proof  of  the  convergence  of  the 
X  -  approximant  sequences  completely  depends  on  the  existence  of 
these  regions  and  their  locations. 

It  is  now  useful  to  make  some  definitions  before  proceeding. 

A  given  system  is  ultimately  prescribed  by  the  behavior  of  the 
function  f(x)  describing  the  corresponding  Lie  operator.  If  the 
complex  x-plane  of  such  a  system  has  a  region  where  any  portion 
of  the  branch  point  trajectories  of  the  X  -  approximants  never 
exist  there,  then  we  shall  call  this  region  a  "clean  region"  in 
accord  with  the  use  of  these  words  above.  If  additionally, 
this  region  includes  the  origin  of  the  complex  x-plane  as  an 
interior  point,  then  this  region  will  be  called  the  "main  clean" 
region  of  the  system.  We  further  define  a  "global  normal"  system 
as  follows:  i f  f  a  system  described  by  f(x)  has  a  main  clean 
region  with  a  non  zero  measure  it  is  a  global  normal  system 
where  we  have  used  a  measure  in  the  sense  that  the  measure  of  any 
countable  infinite  set  vanishes.  This  latter  measure  is  employed 
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to  exclude  the  possibility  of  having  a  clean  region  which  only 

includes  the  origin.  The  interpretation  of  this  definition  of  a 

global  normal  region  can  be  made  as  follows:  if  we  deal  with  a 
finite  period  of  time  then  the  system  will  apparently  have  a  main 

clean  region.  If  we  denote  this  region  by  R(t)  then  we  can  write 

R(t)  =  R8  =>  to]  m(Rs)  >  o  (11.12) 

In  other  words,  the  main  clean  region  will  continue  to  have 
an  infinite  number  of  uncountable  points  around  the  origin 
when  time  tends  to  infinity  if  the  system  is  global  normal. 

This  definition  may  be  relaxed  by  limiting  ourselves  not  just 
to  a  semi-infinite  time  period,  but  to  a  finite  one  starting 
from  t  —  o .  Therefore,  we  can  define  the  "temporary  normal" 
system  as  follows:  a  system  described  by  f(x)  is  temporary 
normal  i i i  it  has  a  main  clean  region  with  a  non  zero  measure 
(m)  for  a  given  time  period  [0,TJ.  Here,  m  is  defined  again 
in  such  a  way  that  the  measure  of  every  finite  or  countable 
infinite  set  is  zero.  Finally  all  remaining  systems  will  be 
•abnormal".  As  can  be  observed  all  global  normal  systems  are 
at  the  same  time  temporary  normal,  and  all  abnormal  systems  can 
be  considered  as  a  limiting  case  (T+0)  of  temporary  normal 
systems . 
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Convergence  Theorems  in  the  One -Dimensional  Case 


From  an  examination  of  Eqs.  (II. 1) ,  (II. 5-7)  we  may  rewrite 


the  approximante  (x,t)  as  follows: 


fi  (x,t) 


x  exp^t) 
Aj (x,t) 


(III.l) 


The  function  Aj(x,t)  in  the  denominator  satisfies  the  recursion 


relation 


ln(x,t)  =  [ARli(x,t)  -  0n  xn_1]n_1  *  =  1 


(III. 2) 


One  may  conclude  from  this  relation  that  the  serial  representa¬ 


tion  of  An(x,t)  in  positive  integer  powers  of  x  with 


time -dependent  coefficients  will  converge  within  a  finite  circle 


of  non  zero  radius  around  the  origin  of  the  complex  x-plane  for 


some  time  period  CO,T).  One  can  then  construct  a  Majorant 


series  for  this  function  such  that 


IAn(x,t)l  <  Dn(x,t)  ;  Dp  >  1  ;  Ixl  <  pn(t) 


(III. 3) 


where  pn(t)  denotes  the  time  dependent  convergence  radius  of  the 


Majorant  series.  The  expression  for  the  bound  Dp  may  be 


established  as  follows 


|aR(x,0  |  <  DE(x,t)  => 

|/£(x,t)  -  Bn*i  xn|  <  1  aJJ  ( x » t )  |  ♦  |on+1|  |x|n 

<  ( x ,  t )  ( 1+ 1 Sn+  i  I  |xn|  j 

->  Un+il  <  DnU.t)  (l'*’  I  ®n+ 1 1  M")  »> 

Dn+i (x*  t )  =  Dn(x,t)  (l+  | dnt 1 1  M") 


(III- 4) 


r  t  i  u.a  a m  irxxvruirL  »ru  srv  xv  ww  tp.i  w/ wv 


'  .*  >V  *  <  *,»  *|»  \t* 


This  latter  result  implies  that 


(l*|0|lt)|xHt>]  (III. 5) 

If  the  infinite  product  in  Eq.  (III.  5)  is  convergent,  this  is  the 
region  of  the  complex  x-plane  defined  by  lx l  <  p(t)  and 

P( t)  >  Pm in  >  o  (III. 6) 

for  all  t -values,  then  Dbo(x,t)  will  converge  to  a  finite  value. 
This  result  also  implies  that  the  function  A«(x,i)  will  con¬ 
verge  for  all  t -values  in  a  region  defined  by  lx  I  <  pmjn. 

The  existence  of  such  a  convergence  implies  that  the  zeros  of  the 
function  An(x,t)  in  the  complex  x-plane  are  bounded  from  below 
in  absolute  value  for  all  times.  This  in  turn  means  that  the 
system  is  global  normal. 

The  condition  for  convergence  of  the  infinite  product  in 
Eq.  (III. 5)  is  equivalent  to  establishing  the  convergence  of  the 
following  expression 

OD 

dN(X'°  =  H  1  °H  +  i  1  |X|N  +  j  (III. 7) 

j  =  i 

If  this  sum  converges  and  remains  smaller  than  unity  for 
sufficiently  large  n  values,  then  the  infinite  product  in 
Eq.  (III. 5)  also  converges.  If  proin  in  Eq.  (III. 6)  vanishes, 
then  two  circumstances  may  occur: 

1)  P(t)  >  Pmin  (T)  >  0  t  €  (0,TJ  (III. 8) 

ii)  p(t)  >  Plain  (T)  Pmin(T)  =  0  (except  T=0)  (III. 9) 

The  first  of  these  cases  corresponds  to  a  temporary  normal 
system,  while  the  second  implies  the  abnormal  case.  We  have 
therefore  proved  the  following  theorem. 


Theorem  I : 


If  the  following  infinite  sum 


d(xft)  =  )  |  Oj  |  xi  (III. 10) 

converges  in  a  circle  around  the  origin  of  the  complex 
x-plane  lxl<p(t),  then  the  following  statements  hold 

(i)  if  p(t)  >  Pmin  >  0  for  t  c  t0,«0, 
the  system  is  global  normal 

(ii)  if  p(t)  >  Pmin(T)  >0  for  t  €  t0,r] 

with  r > 0 >  the  system  is,  at  least,  temporary  normal 

Corollary  I 

If  the  first  condition  (i)  of  Theorem  I  holds,  then  the 
sequence  of  l  -  approximants  converges  for  all  x 
and  t  values  in  the  regions  (-pnun>  Pmin)  and 
CO,*)  respectively,  and  they  have  a  permanent 
main  clean  region  with  non-zero  measure. 

Corollary  II 

If  the  second  condition  (ii)  of  Theorem  I  holds,  the 
sequence  of  ?  -  approximants  converges  at  least  for 
all  x  and  t  values  in  the  regions  (-pmin(T),  Pmin(T)) 
and  [0,r],  t  >  0  respectively  and  they  have  at  least 
a  temporary  clean  region  around  the  origin  of  the  complex 
x-plane . 
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We  now  seek  to  more  explicity  express  the  relation  between 
the  convergence  condition  of  d(x,t)  and  the  nature  of  the 
system.  As  derived  in  the  companion  paper1,  the  o  -  coefficients 
are  described  as 

*n+i(t)  =^n(0)  (HI-11) 


with 


n+i(x)  = 


1 

ontl  *]-*  )  -#n(0) 


(111.12) 

(111.13) 


and 

oo 

^x(X)  =  y  f  j  exp  (-(3  +  1)0!  xJ) 
j=o 

where  the  time  dependence  of  the  &'b  is  not  shown  explicitly 
Now,  if  we  assume  that  f(x)  converges  for  lxl<  (p<>o),  we 
can  write  the  following  inequality 


lf1+2l  4  —  (III. 14) 

pi 

This  relation,  however,  permits  us  to  construct  the  following 
Majorant  function  for 

A* exp(“Oj ) 


Mx(x)  = 


1  - 


x  exp(-Oj) 
P~i 


Ixl  <  p,  exp(Oj)  -  Pi  exptfjt) 


(III. 15) 

Let  us  now  assume  that  we  have  found  a  Majorant  function  for 
£Fn  as  follows 


i  n" 


j  =  o 


<  On 


(III . 16) 
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where  ^  stands  for  time-dependent  coefficients  and  An,  pn 

denote  certain  time -dependent  constants.  The  last  assumption, 
however,  makes  it  possible  to  write  the  following  expression  for 
the  Majorant  function  of  aB  can  b®  revealed  after  a 

careful  examination  of  the  recursion  given  by  Eq.  (II I. 12) 


_  1 

Wufxli-nlOn-jHxl0]  n  | 
,+  i(x)  >  -2J - _ J 


(III .18) 


If  we  use  the  expression  of  Mn  given  by  Eq.  (III. 17),  we  can 


write 


Mn+ 1 (x)  > 

1-  n|on+1 1  +  —  -  x 


(III .19) 


where 


Gn(x)  -  8i,n(x)  /  8a,n(x) 


(III .20) 


n-  j. 

8i,n(x)  =  ( 1-n I °n+ i|xn]1-(j  +  l)/n  p")  XJ  (III. 21) 


mi  j 

8a,n(x)  =  |n  I  °nt  i  I  +  ^Jn  x> 


(III .22) 


Since  Gn(o)  =■  i  and  Gn(x)  is  a  monotonic  decreasing  function  of 
x,  we  can  construct  the  following  Majorant  function  for  the  right 
hand  side  of  Eq.  (III. 18) 


**n+i(x)  - 


l1  - 


(III. 23) 


where 


a  -  An 
*ntl  ■ 


■  Af  exp(-fxt) 


(III .24) 


I 
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—  =  (n|on+x|  +  — In 

1  p"J  J  =  Pf 


exp(f jt) 


(III .25) 


If  we  assume  that  (nton+1j)n  exp(fjt)  is  bounded  by  v,  then  we 


can  write 


«n  <  exp(-fjt)  pn 


(III .26) 


«n+i  = 


«i  =  Pf 


(III .27) 


l  +  v  a 


Therefore,  we  have  made  the  convergence  radii  of  the  Majorant 
functions  smaller.  As  can  be  easily  shown  after  some  inter¬ 
mediate  steps,  an  monotonically  converges  to  a  nonzero  limit, 
say  a,  as  n  tends  to  infinity.  This  makes  it  possible  to  write 


Bn+i  - 


Bn  exp(-fjt) 


B  i  —  A  i 


(III .28) 


Mn  - 


(i  -  e*P<-fit)  2-) 


(III .29) 


Since  6n  <  Mn+ i  we  can  obtain 


.  .  A*  r l  -  exp(-nf,t)i 

^  [ - n? !  1  'I  " 


(III .30) 


which  obviously  satisfies  the  boundedness  condition  of 
(nlon+1l)n  exp(fjt)  globally  for  fj  <  o  and  temporarily  for 
fj  >  o.  This  result  immediately  produces  the  following 


theorem. 


Theorem  II:  If  the  descriptive  function  of  a  given  system  is 

denoted  by  f(x),  (f(o)=o),  then  the  following  statements  are 
true . 


(i)  if  f(x)  has  a  finite  convergence  radius  centered  at  the 
origin  of  the  complex  x-plane  and  fj<o,  then  the  system 
is  global  normal. 

(ii)  if  the  same  conditions  of  case  (i)  hold  except  that 
fx>o,  then  the  system  is  at  least  temporary  normal. 

Our  third  theorem  concerns  the  -  approximants .  Let  us  consider 
the  inverse  relation  between  ?n  and  fn+i. 


?n  = 


’n+i 


i+non+1  fR+1]n 


(III .31) 


If  we  write 


v  ’  rolnt[*  „wl)R  ). 


and  if  the  following  holds  for  a  specific  n 
'?n+i(*iO  1  <  vn+a  <  v 


then. 


‘In'  < 


n+i 


[1  -  •  non+!  I  vR+jJ  n 
Now  one  can  choose  vn+1  in  a  way  such  that 


(III . 32) 


(III. 33) 


(III . 34) 


n+i 


n 


i  -  |non+1|vn+1n]n 


i  +  |nan+1|vR]n 


1  <  v 


n 


>  vn+ 1 


where  vn  is  defined  as  below 

Hn(x,t) I  <  vn  <  v  (III. 36) 

Therefore  we  conclude 

Theorem  III:  If  we  denote  the  mimimum  of  the 
expression  (fell*  n*i,j,..  by  v,  and  for  a 

finite  fixed  n,  the  approximant  remains  smaller 
than  v  in  absolute  value,  then  all  higher  order 
approximants  behave  in  the  same  way. 

The  interpretation  of  this  theorem  is  as  follows.  If  the  system 
is  globally  normal  then  the  limit  of  the  sequence  of  approximants 
f(x,t)  =  ?n  will  remain  permanently  in  the  main  clean 

region. 

In  the  proofs  of  these  theorems  we  assumed  that  f(x)  is 
a  real  function  and  x  is  a  real  variable.  We  did  this  for  the 
sake  of  simplicity.  However,  if  f(x)  and  x  are  assumed  to  be 
complex  quantities,  nothing  will  change  except  the  replacement 
of  fx  with  R  exp(fj)  and  changing  the  intervals  into  the 


circles . 
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IV.  GENERALIZATION  TO  THE  MULTIDIMENSIONAL  CASE 

In  the  companion  paper1  establishing  the  approximants  it  was 
noted  that  there  is  a  degree  of  flexibility  in  the  order  of  the 
elementary  factors  or  propagators  associated  with  a  multi¬ 
dimensional  Lie  transformation.  A  convenient  ordering  for  the 
proof  of  convergence  can  be  written  as  follows 


Q  =  exp( tf  (x)  *v)=exp( t  x^f1^1)  -?) 


"pH'M  §jJ 


(IV. 1) 


where  '  depends  on  xn's  except  xN  and  t. 


We  have  chosen  an  ordering  of  a  product  of  elementary  exponential 
operators  such  that  the  differentiation  with  respect  to  xN  is 
effected  first.  This  ordering  has  a  practical  implication  if  we 
consider  the  effect  of  Q  on  xx,  in  which  case  the  last  (n-x) 
curly  bracketed  operators  reduce  to  unity  due  to  the  fact  that 
they  have  no  effect  on  xx 


Qxx  -  expC  t  xT.fT0) 


■VJ  {Ho  1^)]  X. 


Similarly,  if  we  deal  with  Qxj ,  then  we  can  choose  the  ordering 
or  the  curly  bracketed  operators  in  a  way  such  that 


expt  t  xT.fT(i) -v] 


I77)} 


(IV. 2) 


can  be  written.  Such  changes  of  ordering  will  alter  the  Mj'b 
and  without  any  loss  of  generality  we  may  consider  the  particular 
ordering  in  Eq.  (IV. l). 

To  find,  for  example,  Mo* ^ (*2 *  • • -*n * ' )  we  can  obtain  a 
partial  differential  equation  which  must  satisfy 


5t 


-  41’ 


k2  »  • 


’*N 


N 


)  =2 


*2 


37 


U) 


£ 

j 


(IV. 3) 

where  fj  denotes  the  new  descriptive  vector  element  of  the 
system  after  extraction  of  its  linear  response.  This  may  be 
equivalently  stated  as 


l£(x) lx=o  =0  ’  {Vfi}lxl=o=0 

(IV. 4) 

The  same  equations  are  assumed  to  hold  for  Mq1^ 

MS°[o,o,...,t]=o  ,  {**.), K,.0«o 

(IV. 5) 

since  first  degree  terms  are  excluded  by  extraction  of  the 
linear  response.  Hence,  Eq.  (IV. 3)  may  be  solved  by  a 
multi-dimensional  Taylor  series  with  the  initial  condition 

MoX)  [x2, . . . ,xN,o]=o  (IV. 6) 


The  convergence  of  such  series  have  been  thoroughly  investigated 
in  the  theory  of  partial  differential  equations2.  Therefore, 

Mo 1 ^  *  ®nd  the  other  m's  which  satisfy  the  same  kind  of 


partial  differential  equations  can  be  assumed  convergent  and 
bounded  in  a  closed  domain  around  the  (n-i) -tuple  manifold  formed 
by  the  cartesian  product  of  the  x2 , . . . ,xM-complex  planes. 

In  analogy  with  the  previous  section  one  may  prove  theorems  about 
the  convergence  properties  of  the  sequence  of  approx imants  gener¬ 
ated  by  truncating  the  product  of  operators  in  Eq.  (IV. 2).  These 
same  type  of  statements  follow  as  before  except  through  a  change 
of  the  x-plane  into  an  n-tuple  manifold  constructed  by  the 
cartesian  product  of  the  n-complex  planes  (Xj-plane , . . . , 
xN-plane) . 


V.  CONCLUDING  REMARKS 


* 


* 


* 


In  the  first  of  these  two  papers  we  presented  a  factorization 
scheme  for  Lie  transformation  evolution  operators  and  in  the 
present  paper  we  have  given  sufficient  conditions  for  the  con¬ 
vergence  of  the  scheme.  Under  appropriate  circumstances,  these 
approximants  form  a  practical  tool  to  produce  a  rapidly  con¬ 
vergent  and  high  precision  approximation  to  the  original 
evolution  operator.  These  new  approximants  are  also  richer 
than,  for  example,  Pade  approximants  for  numerical  analysis. 

This  comment  follows  due  to  the  abundance  of  branch  points 
which  make  it  possible  to  characterize  many  types  of  functions 
having  various  types  of  singularities.  These  two  papers  are 
actually  only  the  first  step  in  the  theoretical  development 
of  these  new  types  of  approximants  and  much  additional  research 
needs  to  be  done  for  their  deeper  understanding  and  to  bring  them 
to  a  truly  practical  level. 
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Abstract 

In  this  work  a  new  method  to  factorize  certain  evolution 
operators  into  an  infinite  product  of  simple  evolution  operators 
is  presented.  The  method  uses  Lie  operator  algebra  and  the 
evolution  operators  are  restricted  to  exponential  form.  The 
argument  of  these  forms  is  a  first  order  linear  partial 
differential  operator.  The  method  has  broad  applications 
including  to  the  areas  of  sensitivity  analysis,  the  solution  of 
ordinary  differential  equations  and  the  solution  of  Liouville's 
equation.  A  sequence  of  {-approximants  are  generated  to  repre¬ 
sent  the  Lie  operators.  Under  certain  conditions  the  convergence 
rate  of  the  { -approx imant  sequences  is  remarkably  high.  This 
work  only  presents  the  general  formulation  of  the  scheme  and  some 
simple  illustrative  examples.  Investigation  of  convergence  prop¬ 
erties  is  given  in  a  companion  paper. 
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I .  Introduction 

In  this  paper  a  system  with  n-degrees  of  freedom  will  be 
characterized  by  n  variables,  x1,...,xn,  which  form  real  Euclidean 
space.  If  any  two  points  in  the  space  are  related  by  a  unique 
transformation  Q  whose  functional  structure  does  not  depend  on  the 
location  of  the  points,  then  one  can  define  an  evolution  operator 
for  the  system.  Since  any  two  points  of  an  n-dimensional  space 
may  be  connected  by  a  continuous  curve,  it  is  possible  to  use 
a  tracing  parameter  which  defines  the  position  of  the  system 
on  this  curve  during  its  evolution  from  its  initial  state  x,  to 
its  final  state  x*.  This  circumstance  often  arises  where  time  is 

«v  * 

the  evolutionary  parameter  and  we  will  accordingly  denote  the 
parameter  as  t.  Therefore,  the  initial  and  final  states  of  the 
system  can  be  character ized  by  the  scalar  instants  of  time  t,  and 
tf.  Hence  the  evolution  operator  Q  can  be  represented  as 
Q(tf,t1)  and 

=  W  *  (I1) 

where  the  dot  is  used  to  symbolically  represent  the  effect  of 
Q  on  x, .  In  many  applications  one  can  find  practical  expressions 
for  the  operator  Q  if  t1  and  tf  are  sufficiently  close  to  each 
other.  Hence,  the  global  evolution  operator  Q(tf,t,)  may  be 
factorized  into  a  simple  sequence  of  evolutionary  steps 

Q(tf,t,)  =  Q(tf,t.)  •  Q(t.,t._1) . . -Q(t1,t1 )  (1.2) 

and  by  choosing  m  sufficiently  large  this  factorization  can 
characterize  the  global  evolution  of  the  system.  If  the 
simple  short  time  interval  solutions  were  exactly  calculable, 
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then  the  presence  of  a  large  number  of  such  evolutions,  m,  is 
not  important.  However,  in  reality,  even  the  simple  evolutions 
over  the  short  time  intervals  can  often  be  only  approximately 
determined.  In  such  a  case,  the  number  of  increments  m  is 
significant  since  errors  can  accumulate  to  possibly  create 
numerical  instabilities  and  inaccuracies .  In  addition,  the 
factorization  requires  operators  at  times  other  than  the  initial 
and  final  specified  values.  Therefore,  a  more  global  factorization 
of  the  evolution  operator  such  as  suggested  in  this  paper  would  be 
more  attractive. 

The  present  work  considers  the  factorization  of  the 
evolution  operator  into  a  sequence  of  simple  global  evolution 
operators.  The  scheme  presented  will  maintain  its  validity  only 
on  a  special  subclass  of  evolution  operators.  First,  we  restrict 
the  system  under  consideration  to  being  autonomous  such  that  the 
evolution  operator  has  the  following  simple  structure 


Q(tf,t1)  =  Q(tf-t,) 


(1.3) 


We  also  restrict  ourselves  to  autonomous  evolution  operators 
having  an  exponential  form 


;tf-t,  )  =  e^"^3 


(1.4) 


where  S  denotes  a  t ime- independent  operator.  An  important  class 
of  evolution  operators  is  included  in  the  following  definition 


n  a 

S  =  L  =  L  f(x,...,x)~ 

)  =  i  J  1  h  ax 


(1.5) 


where  the  dimension  or  number  of  degrees  of  freedom  of  the  system 
may  be  finite  or  infinite.  The  finite  dimensional  case  may  be 
directly  related  to  the  corresponding  initial  value  problem 
produced  by  the  set  of  ordinary  differential  equations1, 
i  =  f  (x  ,x  , . . . ,x  ) .  Since  almost  every  partial  differential 

J  J  1  *  N 

equation  with  initial  conditions  can  be  cast  into  an  infinite  set 
of  ordinary  differential  equations  through  an  appropriately 
chosen  basis  set  expansion,  we  may  consider  the  Lie  operator  in 
Eq.  (1.5)  as  capable  of  treating  a  wide  class  of  problems.  Some 
caution  is  still  required  since  the  coefficients  in  Eq.  (1.5) 
are  scalars  while  some  formal  reductions  of  partial  differential 
equations  to  ordinary  differential  equations  can  produce  matrix 
coefficients.  In  summary,  we  restrict  ourselves  to  operators 
having  the  structure  of  Eq.  (1.5)  and  of  finite  order  N. 

Lie  operators  also  arise  in  other  areas  besides  that 
mentioned  above.  For  example,  the  investigation  of  analytic 
simplectic  maps2  and  the  description  of  the  behavior  of  tra¬ 
jectories  near  a  reference  trajectory  for  a  general  Hamiltonian 
system3  are  also  other  applications.  This  latter  work  is  distinct 
from  the  present  paper  where  we  seek  a  global  approximation  to 
the  evolution  operator  that  is  valid  within  a  region  of 
space  without  regard  to  a  reference  trajectory.  In  addition,  our 
approximation  of  factorizing  the  exponential  evaluation  operator 
into  a  product  sequence  of  global  operators  is  different  from 
that  developed  before.  Recent  additional  works4-6  have  con¬ 
sidered  the  use  of  Lie  transf ormations  to  perform  parameter 
space  mapping  of  the  solution  of  ordinary  differential  equations. 
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space  mapping  of  the  solution  of  ordinary  differential  equations. 
Other  applications  may  also  be  found. 

The  remainder  of  this  paper  is  organized  in  the  following 
fashion.  Section  II  gives  the  general  formulation  of  the  global 
factorization  for  one-dimensional  systems  followed  by  a  generali¬ 
zation  to  multi-dimensional  systems  in  Section  III.  Some 
illustrative  examples  are  treated  in  Section  IV  and  concluding 
remarks  are  given  in  Section  V. 

I I .  Pactor ization  Procedure  in  the  One-Dimensional  Case 
Lie  exponential  evolution  operators  defined  by  Eqs . 

(1.4)  and  (1.5)  frequently  arise  in  many  applications.  One 
application  that  was  mentioned  above  arises  in  the  treatment  of 
ordinary  differential  equations.  In  particular,  if  we  can 
evaluate  the  effect  of  the  Lie  transformation 

Q  =  etL  ;  L  -  f (x)  •  v  (II. 1) 

on  the  position  vector  x  around  a  point  a  in  the  phase  space  of  a 
system  defined  by 

i  =  f(x)  ,  (II. 2) 

then  the  solution  to  these  equations  may  be  written  as 

!U.‘)  =  I'’1  !]„  <”-3> 

where  x,a  and  v  are  defined  in  the  following  manner 


a, - aj 


(II- 5) 


T 

& 
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(II. 6) 


This  relation  between  the  solution  of  ordinary  differential 
equations  and  Lie  transformations  may  conversely  be  used  to 
determine  the  action  of  the  operator  Q  on  the  position  vector  by 
solving  the  following  ordinary  differential  equation 


£(x,t)  =  f  U)  ;  £(x,0)  =  2 


(II. 7) 


This  approach  to  determining  Q  is  generally  not  preferable  since 
Eq.  (I I. 7)  is  often  only  soluble  by  elaborate  numerical  tech¬ 
niques  which  will  hide  the  important  structure  of  the  desired 
transformation.  Although  the  approach  pursued  here  is  also 
approximate,  it  will  still  leave  the  structure  of  the  evolution 
operator  rather  apparent . 

In  order  to  appreciate  the  approach  taken  below,  we  recall 


some  important  properties  of  Lie  transformations 
etLlf (x)g(x) 1  =  [etLf(x)I  (etLg(x) 3 

etLf(x)  =  f(etLx) 


(11. 8) 

(11. 9) 


The  first  of  these  equations  states  that  a  Lie  transformation  on 
a  product  of  two  functions  f(x),  g(x)  can  be  factorized  to  the 
product  of  the  Lie  transformation  on  the  individual  functions. 
This  property  is  due  to  the  exponential  structure  of  the  Lie 
transformation  along  with  application  of  the  Leibnitz  rule  of 
differentiation,  and  the  relation  is  valid  provided  that  f  and  g 
are  infinitely  differentiable  functions.  The  penetration 
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property  in  Eq.  (I I. 9)  also  follows  due  to  the  particular 
structure  of  the  Lie  transformation  and  the  assumed  infinitely 
differentiable  nature  of  the  function  f.  Finally,  one  additional 
well  known  property  of  Lie  transformations  concerns  the  special 
case  of  the  translation  operator 

e'i’S  f(x)  =  f(x  +  t  a)  (II. 10) 

which  followed  from  a  simple  Taylor  expansion  of  the  right  hand 
side . 

We  now  desire  to  investigate  the  factorization  of  Lie 
transformations  for  one-dimensional  systems.  Although  the  one- 
dimensional  nature  of  the  problem  makes  it  formally  rather 
simple,  this  case  also  provides  the  best  means  to  develop  the 
factorization  scheme  presented  here.  In  this  case  the  Lie 
transformation  can  be  written  as 

Q  =  exp[tf(x)  f^]  (II. 11) 

where  f(x)  may  have  a  number  of  zeros  with  one  assumed  to  exist 
at  the  origin  of  the  complex  x-plane.  This  assumption  about  the 
location  of  a  zero  of  f(x)  at  the  origin  does  not  create  any  loss 
of  generality  since  a  simple  translation  can  bring  one  of  the 
zeros  of  f(x)  to  the  origin.  The  assumption  about  the  existence 
of  at  least  one  zero  of  f(x)  is  more  restrictive.  However,  in 
problems  where  f(x)  forms  the  right  hand  side  of  an  ordinary 
differential  equation,  there  will  usually  be  at  least  one 
stationary  point  for  the  solution.  Therefore,  the  assumption 
about  the  existence  of  a  zero  of  the  function  f(x)  may  be 
regarded  as  a  minor  loss  of  generality. 
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We  may  now  make  the  additional  assumption  that  the  function 
f(x)  may  be  expanded  in  a  Taylor  series 

f(x)  =  fx'  IxKp  (11-12) 

where  the  expansion  coefficients  f^  are  taken  as  known  from  the 
definition  of  the  system.  The  expansion  above  implies  that  the 
system  is  well  characterized,  at  least  in  a  restrictive  domain 
around  the  origin  of  the  complex  x-plane.  We  seek  the  factori¬ 
zation  of  the  evolution  operator  Q  such  that  every  factor  has  an 
independent  contribution  in  a  fashion  analogous  to  each  term  in 
the  Taylor  series  of  Eq .  (11.12).  To  this  end  we  define  the 
flexible  factorized  structure 

Q  =  exp[tf(x)§^]  =  Jgi  exp[oj(t)x3!^]  (11.13) 

where  o^(t)  are  arbitrary  at  this  point  and  yet  to  be  determined. 

Equation  (11.13)  is  the  factorization  formula  for  the  one¬ 
dimensional  case. 

Por  the  one-dimensional  case  the  factorization  in  Eq .  (11.13) 
may  seem  to  be  unnecessary  due  to  the  fact  that  the  equation  x  = 
f(x)  can  be  solved  by  the  usual  techniques  of  numerical  analysis. 
However,  in  order  to  gain  insight  into  the  more  interesting 
multi-dimensional  case,  the  present  reduced  case  presents  the 
best  way  to  understand  the  theory.  Despite  the  existence  of 
some  attempts  to  factorize  Q  by  time  ordering  techniques  with 
respect  to  t,  to  our  knowledge  there  has  been  no  factorization  of 
Q  along  the  lines  presented  in  Eq.  (11.13)  except  Dragt's  work 
for  a  different  purpose  and  in  a  different  framework. 
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Assuming  that  (11.13)  holds  and  the  coefficients  are 
known,  it  is  a  simple  matter  to  determine  the  effect  of  the 
operator  Q  on  x.  For  this  purpose  we  can  investigate  the  indi¬ 
vidual  effects  of  the  factors  in  Eq .  (11.13) 


At  this  point  we  need  to  determine  the  coefficient 


functions  Oj .  To  this  end  we  can  use  the  following  relation 


as  . 
at 


-  h  )<>  ■  <5<°)  ■  1 


(11.19) 


which  follows  from  Eqs .  (11.12)  and  (11.13).  If  we  now  write 


Q  =  Q(1)Qa  =  exp[0l(t)x  |j]  Qi 


(11.20) 


we  may  arrive  at 


exP[-°i(t)x§-]  [f(x)f- 


°ix  ]exP[°i( t)x  |j]  Qi 


(11.21) 


using  the  properties  in  Eqs.  (II. 8)  and  (II. 9).  This  result  may 
be  re-expressed  as 


i  =  ff[exP[-°*x§J  X] 

lexp[-0lx  §j]  x 


-  -  *1  x  Qi 


(11.22) 


The  following  formula 


exp [-oi  ( t)  x  |x]  x  =  exp(-Oj  (t ) )  x 


(11.23) 


allows  for  a  rewriting  of  Eq.  (11.22)  utilizing  the  expansion  in 
Eq.  (11.12) 


It  =  (fi  “  6i)  +  (f 2exp(-ox(t) )  x)  +  .. 


x  k  Qx 


(H.24) 
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The  operator  acting  on  Q,  on  the  right  hand  side  of  Eq .  (11.24) 

Is  a  power  series  In  x.  Each  of  the  terms  of  this  series  Is 
independent  and  In  the  vicinity  of  the  origin  the  first  term  will 
be  dominant.  We  desire  to  make  Qx  as  slowly  varying  as  possible 
and  therefore  demand  that  the  leading  term  in  the  series  vanish 
for  this  purpose 

6j(t)  =  ;  o^O)  =  0  (11.25) 

The  initial  condition  has  been  taken  as  zero  to  make  the  simple 
evolution  operator  qO)  unitary.  Equation  (11.24)  now  has  the 
form 

ao 

-at  -  f<  ’<*>  I;  >  V°>  -  1 
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where  f^l3(x)  can  be  identified  from  the  remaining  series  of 
terms  in  the  brackets  of  Eq .  (11.24)  and  f(*)(0)  is  finite. 
Exactly  this  same  logic  may  be  put  forth  to  evaluate  o2(t)  by 
successively  eliminating  higher  order  powers  of  x  in  the  differ¬ 
ential  equation.  To  construct  a  general  recursion  we  assume 
knowledge  of  the  first  n  of  the  o^'s  and  write 

Q  =  I  n  Q(3)}  Q  (11.26) 

lj  =  i  J  n 

which  suggests  the  equation 


dQ 


at 


£1  -  f  (  "  ) 


-  f  <*)  *  ^  Q. 


n  + 1  a_ 
ax 


Qn(0)  =  i 


(H-27) 


i 
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The  time-dependence  of  f(°)(x)  is  not  explicitly  shown  and  the 
function  f(°)(x)  is  regular  at  the  origin  of  the  x-plane  and  is 
to  be  determined.  We  now  may  write 


Q  n  *  Q 


(n  +  l) 


Qn+l  =  exP[°n+l(t)xn+1|^]  Qn+l  (11.28) 


and  obtain 


n  +  l 


at 


f  (n)  [exp[-an  +  1(,)xf'+1|7]x]-  on  +  1 


rfl+  1 


67  Qn  +  1 


(11.29) 

by  utilizing  again  the  properties  in  Eqs.  (II. 8)  and  (II. 9). 
Employing  the  action  of  the  factorization  operator  in  (11.18) 
gives 


*»n  +  i 

at 


rf(n)f  X  .  .  1 

-  6  1 

L  ( 1+no  ( t )  x")l/n  J 

n  +  lj 

(11.30) 

We  now  apply  logic  analogous  to  that  leading  to  Eq.  (11.25)  and 
eliminate  the  dominant  contribution  to  the  bracketed  quantity 


multiplying  the  operator  xn+l  yielding 


(n) 


°n+  1  =  *'°)  S  ^n+1(°)  =  0 


where  the  initial  value  is  again  chosen  as  zero  to  make  Q 
unitary.  Therefore  we  conclude 


(H.31) 
(n+i ) 


^^n+l  -.(n+l).  .  n+2  d 

m  f  <x)  x  a7  Qn+i 


at 


(11.32) 


where 


f'"+1)cx)  =  A 


(n) 


(l+non+x  (t)  xn)L/ 


i/n 


(n) 


(0) 


(11.33) 
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This  is  a  first  order  recursion  relation  with  the  initial 
condition 

f(l)(x)  =  [f (exp(-0!(t) )  x)  exp(Oj (t) )  -  fxx]  (11.34) 

All  of  the  cr-f unctions  can  be  evaluated  analytically  in 
principle,  however  this  is  a  tedious  task  and  the  use  of  a 
symbolic  programming  language  such  as  MACSYMA  or  REDUCE  is 
recommended.  The  first  five  of  the  o-f unctions  are  given  below. 


Ol(t) 

=  fit 

(11.35) 

o2(t) 

=  8i(t) 

(11.36) 

o3  (t) 

«  f3  ga(t) 

(11.37) 

f  f  2  f  3  ! 

I  f2f3 

o4(t) 

»  f4+  -J-11 

8a(t)  ~  8a(t) 

(11.38) 

1  I  i 

L  f2f* 

f2fal 

os(t) 

-  -rr 

+  yjry—  8* ( t)  + 

e 
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If  the  infinite  product  representation  of  Q  given  by  (11.13) 
converges,  then  the  following  result  will  hold. 

=  Qx  =  exp[t£(x)|j]  x  =  lim  ln  (II.42) 

Since  the  action  of  Q  on  x  defines  the  fundamental  operations 
of  concern,  we  now  focus  our  attention  on  the  f -approx imants . 

A  recursion  relation  for  these  approximants  can  be  obtained  by 
first  noting  that 

fn+i  =  {jSiQ1}  «xp[°n+i(t)  xn+1|^]  x  (11.43) 


An  application  of  Eq.  (11.18)  yields 


1 - 1  1 

1  -  non+1  xnJn 


(11.44) 


Since  a  product  of  Lie  transformations  is  again  a  Lie 
transformation,  we  may  use  the  property  in  Eq.  (I I. 9)  along  with 
Eq.  (11.41)  to  conclude  that 


?n+ i (x  *  t ) 


?n(x,t) 

[l  -  n  on+1(t)  ?R(t)]l/n 


(11.45) 


This  i 8  a  rather  simple  first  order  recursion  (difference 
equation)  whose  initial  member  is  evaluated  as  follows 


?i(x,t)  =  exp [ot ( t )  x  x  =  x  exp(ox ( t ) )  =  x  exp(fxt) 

(11.46) 

Although  this  is  a  simple  recursion  relation,  it  is  not  typically 
suitable  for  numerical  purposes.  Numerical  instabilities  will 
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occur  if  fj  is  negative  resulting  in  excessively  small  quantities 
for  large  times  t  or  also  under  the  conditions  that  x  tends  to 
zero.  In  these  cases,  error  accumulations  may  occur  due  to  the 
truncated  arithmetic  on  the  computer.  To  prevent  this  error  we 
may  renormalize  the  f-approximants  and  define  a  new  recursion 
relation 

£n 

t  n+i  =  f - p: - ;  €1  *  1  (H.47) 

[1  ~  °n+ix  *njn 

where 

6n+1  =  non+1  exp(nfAt)  (11.48) 

The  relation  between  the  new  approx imants  and  the  previous  ones 
is 

In(x,t)  =  €n(x,t)  x  expCfjt)  (11.49) 

which  also  implies 

5(x , t )  =  exp[tf(x)|^]  x  =  6 (x,t)  x  exp(fAt)  (11.50) 

Since  the  term  x  exp(fxt)  characterizes  the  linear  response  of 
the  system,  we  can  consider  {(x,t)  as  a  function  measuring  the 
deviations  of  the  system  from  its  linear  response.  We  will 
accordingly  refer  to  (  as  a  "deviation  function".  As  can  be 
easily  seen,  the  ?  and  {-approximants  have  branch  points  which 
move  on  trajectories  in  the  x-plane .  The  location  of  these 
trajectories  determines  the  convergence  regions  of  the  approxi- 
mants.  We  shall  leave  the  discussion  of  this  issue  and  a 
comparison  of  the  {-approx imants  with  Pade  approximants  to  a 
companion  paper. 
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III.  GENERALIZATION  OF  THE  FACTORIZATION  SCHEME  TO 

TOE  MULTIDIMENSIONAL  CASE 

The  logic  put  forth  in  section  II  for  a  systematic  factori¬ 
zation  of  one  dimensional  evolution  operators  may  now  be 
generalized  to  multidimensional  cases.  In  this  situation,  the 
evolution  operator  acting  in  a  space  of  dimension  n  has  the  form 


Q  =  exp( t  f  (x) -v) 
where 

*T  =  Ixj , . . . ,xN1 

#w 

f  _  3  3 

~  “  3xj ’ ‘  ‘ ’5xn 

f T  =  If J (X) , . . . ,  fN(x) 3 


(III  .1) 


(111. 2) 

(111. 3) 
(HI.  4) 


The  function  f(x)  is  assumed  to  have  a  zero  at  the  origin 
•*» 

uT+o  £<i>  ■  0 

*** 

and  it  is  also  assumed  to  be  expandable  in  a  multidimensional 
Taylor  series  in  the  variable  xx,...xN.  This  latter  expansion 
can  be  written  in  tensor  form  as 


fx  "  XJ  +  f1)K  xjxk  +  f  1  ®  k  1  x  J  3CKX !  +  ... 

(III. 6) 

where  the  convention  of  the  explicit  summation  over  repeated 
indices  is  used  for  convenience. 

In  the  one-dimensional  case  the  operator  exp(oxx^)  played  a 
fundamental  role  in  the  first  step  of  establishing  a  recursion 
relation  for  the  approximants .  The  same  situation  occurs  again 


here  and  we  shall  denote  this  first  degree  operator  QL  as 
taking  on  the  following  form 

Ql  =  exp(xTo(1)v)  (III. 7) 

where  o^1)  is  a  square  matrix  or  equivalently  a  second  degree 
tensor.  The  effect  of  this  operator  on  the  position  vector  x 
is 

Qlx  =  exp(oT ( 1 ) )  x  (III. 8) 

Since  QLx  must  be  the  system  linear  response  we  can  conclude  that 

ojk(1)  -  t  rj}>  ),k  -  1 - «  (III. 9) 

Henceforth,  we  shall  denote  the  linear  response  of  the  system 
evolution  by  S, 

S  =  exp( tf ( 1 ) )  (III. 10) 

Using  the  definition  of  the  scalar  product  of  two  tensors  of  the 
same  order , 

A,  i  B,  i  =  A  o  B  (III. 11) 

h  -  •  •  Jn  3  l  *  •  •  3  n 

we  can  write  the  evolution  operator  in  Eq.  (III.l)  as  the 
infinite  order  factorized  product 

OO 

Q  =  QL  n  exp[o(*Oo  (III. 12) 

k  =  2 

where  is  a  k-th  order  tensor  to  be  be  determined  and 

L ( k )  is  a  tensor  valued  operator 
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The  tensor  product  In  the  argument  of  each  exponential  term  in 
Eq.  (Ill  .12)  is  itself  a  sum  of  operators  which  would  be 
difficult  to  deal  with  in  practice.  Therefore,  we  have  further 
factorized  each  individual  term  in  Eq.  (III. 12)  (except  QL ) 
to  obtain 

Q-  Q,  n  n’  (III. 14) 

where  it  is  understood  that  the  coefficient  functions  o(K)  are 
now  distinct  from  the  set  in  Eq.  (III. 12).  The  starred  product  in 
this  formula  means  that  the  product  operation  is  performed  over 
the  entire  domain  of  the  j -indices.  There  is  no  unique  ordering 
to  the  factorization  in  Eq.  (III. 14)  for  a  multi -dimensional 
case.  However,  if  we  define  the  following  operators 

Q(n)  =  exp [o( n )  ©  L(")]  (III. 15) 

Q<">  -  n’  e*p[oj^a...Jn  L,("?..)n]  (III. 16) 

one  can  prove  that 

{§(n)x|  -  jQ(n)x)  =  o[x2n-1j  (III. 17) 

Therefore,  within  this  level  of  approximation  the  expressions 
in  Eqs .  (III. 12)  and  (II I. 14)  may  be  considered  equivalent.  The 
form  given  by  Eq.  (III. 14)  is  practical  since  each  of  the 
sequence  of  evolution  operators  acts  on  a  particular  coordinate 
and  degree  of  freedom. 


The  procedure  for  determining  the  a  -  tensor  is  the  same 
as  in  the  previous  section,  however  all  scalars,  (except  time) 
must  be  replaced  with  tensor  quantities  and  the  conventional 
algebra  must  be  replaced  with  tensor  algebra.  The  details  of 
these  operations  will  not  be  dealt  with  further  here,  but  the 
use  of  symbolic  programming  languages  would  be  most  helpful 
in  practice.  The  second  degree  o  -  tensor  is  given  below 
as  an  example. 


o 


(2) 


i  k  1 


|  S,.<r) 


dr 


(III . 18) 


The  evaluation  of  the  {  -  approximants  can  again  be  accom¬ 
plished  by  using  the  consecutive  effects  of  the  individual 
factors  of  the  evolution  operator.  Symbolic  programming 
techniques  would  likely  be  the  best  procedure  for  determining 
the  xlf...,xN  and  t  dependence  of  the  t  -  approximants. 
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IV.  ILLUSTRATIVE  APPLICATIONS 

In  this  section,  five  problems  are  considered,  each  of 
which  exhibits  different  types  of  behavior.  For  the  sake 
of  comparison  with  the  techniques  introduced  in  the  previous 
sections,  we  have  chosen  analytically  soluble  problems  as 
described  below. 

i)  f(x)  =  l  -  e*  (IV. 1) 

From  traditional  linear  stability  analysis  arguments  this 
system  is  stable  for  x>o  and  unstable  for  x<o.  There  is  also 
only  one  steady  state  point  located  at  the  origin.  An  analytic 
expression  for  the  effect  of  the  Lie  transformation  on  x  can  be 
written  as 

£(x,t)  =  exp[tf(x)|^]  x  =  -  in  [i-(i-e-*)  e_t]  (IV. 2) 

A  careful  examination  of  the  structure  of  £(x,t)  reveals  that  its 
branch  point  traverses  the  path  from  -**>  to  +«*>  along  the  horizon¬ 
tal  axes  Tin  as  time  evolves.  Figure  la  plots  the  exact 
deviation  function  £(x,t)  and  its  first  five  approximants  £n(x,t) 
as  defined  in  Eqs.  (11.50)  and  (11.49)  respectively  for  the 
case  x=o.i.  It  is  apparent  that  the  approximants  uniformly 
converge  to  the  true  deviation  function  as  n  increases.  The 
error  between  the  true  deviation  function  and  the  n=s 
approx i man t  is  shown  in  Figure  lb  where  it  is  apparent  that 
the  error  decreases  monotonically  to  an  asymptotic  value 
for  large  times.  A  similar  pair  of  plots  is  shown  in  Figure  2 
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for  x—  5.0.  At  this  larger  value  of  x  qualitatively  similar 
behavior  occurs  but  the  rate  of  convergence  of  the  approx imants 
is  slower  and  the  peak  in  the  error  function  may  be  a  signal  of 
the  loss  of  global  convergence.  The  situation  for  negative  values 
of  x  is  different  as  shown  in  Pigure  3.  Figure  3  presents  the 
case  for  x=-i.o.  The  approximants  in  this  case  seem  to  show 
oscillatory  nonmonotonic  behavior  with  regard  to  true  deviation 
function.  The  error  of  each  of  the  approximants  is  qualitatively 
similar  to  that  of  Figures  1  and  2.  At  a  sufficiently  large 
negative  value  of  x  singular  behavior  shows  up  resulting  in 
apparent  non-convergence. 

ti)  f(x)  =  l  -  e~*  (IV. 3) 

This  system  is  unstable  for  positive  x  values  due  to  the 
first  Taylor  coefficient  being  positive.  It  has  only  one  steady 
state  point  located  at  the  origin  of  the  x-plane.  The  analytic 
expression  of  the  Lie  transformation  effect  on  x  is 

?  ( x , t )  =  ln{i+(ex-i)et}  (IV. 4) 

The  branch  point  trajectory  of  this  system  matches  with  the 
negative  portion  of  the  real  axis  of  the  x-plane.  The  branch 
point  moves  on  this  line  towards  the  origin  as  time  evolves 
and  reaches  there  in  the  limit  that  t-»®  .  Figure  4  shows 
the  deviation  approximants  and  the  error  of  the  fifth  member 
for  x=o.i.  Apparent  convergence  failure  is  observed,  however 
during  a  finite  time  interval  starting  from  t=o  there  is 
temporary  convergence. 
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i 

111)  f(x)  =  sin  x  (IV. 5) 

1 

.  This  system  is  unstable  around  x  =  o  for  x>o,  however  it  has 

>  infinitely  many  steady  state  points  and  they  alternatively  make 

> 

j*  the  system  either  stable  or  unstable.  Figures  5  depicts  the 

'  approximant  behavior  for  the  case  x=i.o.  There  is  apparent  con¬ 

vergence  behavior  in  the  figure,  however  a  peak  in  the  error 
•  function  may  again  be  a  signal  of  the  loss  of  global  convergence. 

It  is  difficult  to  prove  this  point  from  only  a  finite  number  of 

I 

I  approximants .  An  additional  calculation  is  shown  in  Figure  6  for 

x  =  5.0  which  is  beyond  the  second  stationary  point  of  sinx. 

|  This  well  behaved  nature  of  the  approximants  is  probably  due  to 

t 

the  fact  that  all  the  branch  points  of  this  system  are  purely 
4*  imaginary. 

i v )  Stakgold  problem7 

^  This  problem  is  associated  with  the  consideration  of  two 

coupled  nonlinear  differential  equations  with  system  coefficients 

( 

[  given  by 

f l(x1,xa)  =  XXx  -  x2  -  xx(x2+x2) 

f 2 (X1 »x2 )  =  *x2  -  X1  -  X2(x?+x2)  (IV. 6) 

The  analytic  expression  for  the  effect  of  the  Lie  transformation 
on  the  position  vector  is 

f1(t)  =  (x1co8  t  -  x2sm  t)  e~,Xlt  i7(x,t) 

f  a  (t)  =  (x1sin  t  +  xacos  t)  e  7j(x,t)  (IV. 7) 

where  X  is  assumed  to  be  negative  and  y  is  defined  as  follows 
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T?(X,t)  = 


^  X?  +  x^ 

f  -2 1 X 1 t) 

1  +  — m — 

[!  -  e  j 

-1* 


(IV. 8) 


This  system  is  stable  as  long  as  X  remains  negative.  In  the 
case  of  positive  X  the  same  condition  again  holds  but  the 
system  does  not  have  a  steady  state  solution  and  a  limit  cycle 
appears . 

In  applying  the  method  of  Section  III  to  Eq.  (IV. 6)  we  will 
find  that  the  system  has  only  the  following  non  zero  tensor 
coefficients 


f(D_  x 
1  i  i  K 


f(l)-  -i 
1  1  2  1 


f(D  = 


2  1 


=  1 


f(l)=  X 

r  2  2  *• 


f<3) 


1111  - 


(3) 


1122 


f(3) 

1  2  2  1  1 


fO)  ~  -1 
r2  2  2  2  1 


(IV. 9) 

(IV. 10) 


Accordingly,  the  linear  response  term  would  be  expressed  by 
the  tensor 


S  =  exp( tf ( 1 ) ) 


(IV. 11) 


and  elements  of  this  matrix  and  its  inverse  are  given  by  the 
following  expressions 


S 


n  = 

ext  cos  t 

Sia  = 

-ext  sin  t 

2  1  = 

ext  sin  t 

S  2  2  = 

ext  cos  t 

(IV. 12) 

(-D 

1  1 

=  e-xt  cos 

t 

g  (  ~  1  ) 
51  2 

=  e“xt  sin 

t 

(-D 

2  1 

=  e-xt  sin 

t 

g  (  ~  1  ) 
“2  2 

=  e_xt  cos 

t 

(IV. 13) 

T 


Since  the  second  degree  Taylor  expansion  coefficients  are  zero 
it  follows  that 


d 


o<2)  =  o 


(IV. 14) 


The  third  order  terms  are  non  zero  and  a(3)  may  be  shown  as 


n(3)  -  f  s  f<3>  g<_1>  g(_1)  s*"1)  dr 

°i1i2i3i4  ~  blimi  Im1m2m3m4  bm2i2  bm3 1 3  bm4 1 4  aT 


(IV. 15) 

where  the  explicit  summation  rule  over  repeated  indices  is 


employed.  After  some  tedious  algebra,  one  can  show  that  all 


elements  of  the  o(3)-tensor  vanish  except  for  the  following 


four  members 


(t)  =  o 


=  „(») 


(t)  =  o 


=  „<3) 


2  2  11 


(  i  -aAt 

-  0222  2  (t)  =  £  °3(t) 


(IV. 16) 


This  result  immediately  yields  the  tensor  product 


o(3)  o  L(3)  -  o3(t)  [xj  +  xj]  [xi  +  x2  ] 


(IV. 17) 


As  can  be  easily  observed  the  operators  o(3)  o  l( 3 )  and 


f(3)  ©  l C 3 )  commute  and  therefore  there  will  be  no  contribution 


from  higher  degree  terms  of  the  remainder  during  the  elimination 


of  the  operator  o(3)  e  l(3)  from  the  structure  of  L.  In 


addition,  there  are  no  higher  order  terms  than  these  already 


coming  from  the  structure  of  L  itself.  Hence,  we  may  conclude 


that  the  factorization  exactly  truncates  at  its  second  order 


terms  if  we  retain  o(3)  ©  L^3)  as  a  global  second  degree 


'  w*  V 
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Lie  operator.  Indeed,  if  we  write 

Qx  =  exp[xT  fT(J)-  v]  exp[o(3)  o  L^3)]  x 

os  6] 

in  eJ  , 

then  it  follows  that 

Qx  = 


a  1 - 1 - ,u 

li  -  203(t)r2J* 


(IV. 18) 


ext 

cos  t  -sin  t 

*i 

[1+fe2Xt_n  +  xf 

L  1  J  X 

sin  t  cos  t. 

■  X  2  ■ 

(IV. 19) 


and  the  exact  result  is  obtained.  In  this  result  we  have 
first  used  polar  coordinates 


r  =  [xi2+  Xjj1*  cos  B  =  x^xi+x2] 


cj(  3 )  o  l( 3 )  =  o3(t)r3 


S? 


(IV. 20) 


and  then  returned  to  the  cartesian  representation  in  Eq.  (IV. 19). 

The  result  in  Eq.  (IV. 19)  is  just  a  confirmation  of  the 
operator  algebra  introduced  earlier  in  the  paper.  We  may  still 
go  a  step  further  and  factorize  the  evolution  operator  involving 
o ( 3 )  o  l(3)  to  obtain 

Q(2)  =  exp[o3(t)  x3  |^-]  exp  [a3  (t)  x2^ 
exp[o3(t)  x3  xx  1^-]  exp [o3  ( t )  x3 

(IV. 21) 

This  different  factorization  creates  an  error  which  is  of  the 
order  of  magnitude  of  fifth  degree  terms.  In  this  case, 
an  infinite  product  appears  which  converges  about  the  origin. 


v)  space  extension 

Consider  the  system  defined  by  the  function 

f(x)  =  y/T-xl~  (IV. 22) 

This  function  has  two  zeros  located  at  the  points  x=+i  and 
x=-i,  however  it  does  not  fulfill  the  requirements  for  our 
method.  In  particular,  it  cannot  be  expanded  in  a  Taylor  series 
around  these  points.  Nevertheless,  the  problem  may  still  be 
approached  by  extending  the  space  to  two  dimensions  through  the 

introduction  of  a  new  variable  in  addition  to  x  as  follows 

y  =  ✓V=x3  (IV. 23) 

We  can  now  define  a  new  system  with  the  descriptive  functions 

fi(*»y)  =  y 

f2(x,y)  =  -x  (IV. 24) 

This  new  system  satisfies  all  of  the  necessary  conditions  for 
factorization.  Therefore,  in  cases  such  as  these,  the  technique 
of  space  extension  may  make  it  possible  to  factorize  Lie  trans¬ 
formations  which  otherwise  might  not  admit  to  direct  treatment. 


V.  CONCLUDING  REMARKS 


The  basic  thrust  of  this  paper  is  the  development  of  a  new 
sequence  of  approximants  appropriate  for  time  evolution 
operators  with  Lie  generator  arguments.  Although  we  have  not 
given  convergence  theorems  for  the  ^-approximants  sequences, 
the  results  in  Section  IV  are  encouraging.  Rapid  and  highly 
accurate  convergence  seems  to  be  obtained  at  least  in  a 
sufficiently  closed  vicinity  to  the  origin.  The  next  step  in 
this  work,  examined  in  a  companion  paper,  is  the  investigation 
of  the  {-approximant  singularities  and  some  convergence  theorems. 

Actual  implementation  of  the  factorization  scheme,  especially 
for  multidimensional  cases  can  involve  a  considerable  amount  of 
algebra.  The  use  of  symbolic  programing  on  the  computer  would 
likely  be  a  necessity  in  these  cases,  and  this  issue  also 
needs  further  investigation  for  its  practical  implementation.  A 
number  of  applications  of  the  factorization  may  be  envisioned  as 
suggested  in  the  introduction.  Evolution  operators  of  the  type 
studied  in  this  paper  occur  in  a  wide  variety  of  problems,  but 
perhaps  the  most  obvious  and  simple  application  would  be  to  the 
solution  of  ordinary  differential  equations.  The  possible 
attraction  here  follows  from  the  fact  that  the  approximants 
provide  a  global  solution  in  time  rather  than  the  usual 
sequential  time  stepping  procedures.  A  number  of  numerical 
issues  need  to  be  addressed  for  this  case  as  well  as  other 
applications  before  the  optimal  realm  of  utility  of  the 
scheme  may  be  established. 
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Figure  Captions 


Figure  1 


Figure  2. 


Figure  3. 


Figure  4. 


Plot  of  the  exact  deviation  function  £(x,t)  and 
its  first  five  approximants  $n(x,t),  n=i,...s 
for  the  characteristic  function  in  Eq.(IV.l)  where 
x=o . x  In  figure  a,  the  last  three  approximants  are 
indistinguishable  from  the  exact  result.  Figure  b 
shows  the  deviation  function  for  the  approximant 
n=5.  The  same  line  masks  in  figure  a  will  be  used 
in  the  remaining  {-approximant  plots. 

The  same  as  figure  1,  except  x=s.o.  These  results 
are  qualitatively  similar  to  those  of  figure  1 
except  now  the  convergence  rate  is  slower  and  there 
is  a  peak  in  the  error  function. 

The  same  as  figure  1,  except  now  x  -  -i.o. 

Apparent  oscillatory  nonmonotonic  behavior  is 
exhibited  with  respect  to  the  true  deviation 
function  in  figure  a. 

Figure  a  exhibits  the  exact  deviation  function 
{(x,t)  and  the  first  five  approximants 
£n(x,t),  n=i,...s  corresponding  to  the  fundamental 
function  in  Eq.  (IV. 3)  at  x  =  o.i.  Figure  b  shows 
the  error  function  for  n=s  approximant.  Apparent 
divergence  behavior  is  observed  at  long  time; 
however,  during  a  finite  interval  around  the  origin 
there  is  temporary  convergence.  Here,  only  the 
fifth  approximant  goes  to  infinity.  However,  the 
first  four  approximants  also  have  branch  points 
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close  to  zero  but  they  do  not  make  the  denominator 
in  the  corresponding  transformation  from  the  pre¬ 
vious  approximant  zero  when  t<io.  Some  of  the 
branch  points  are  not  even  on  the  positive  real 
axis . 

Figure  5.  The  exact  deviation  function  £(x,t)  and  its  first 

five  approximants  £n(x,t),  n=i,...s  for  the 
characteristic  function  in  Eq.  (IV. 5)  at  x=i.o. 

The  first  and  second  approximants  coincide  as  well 
as  the  third  and  fourth  approximants.  There  is 
apparent  convergence  behavior  in  Figure  a, 
however  a  peak  in  the  error  function  in  Figure  b 
may  signal  a  loss  of  global  convergence. 

Figure  6.  The  same  as  Figure  5,  except  that  now  x=s.o. 
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I.  INTRODUCTION 


Consider  the  parameterized  nonlinear  system 

v*o(p)  .  /  *(<’?)  =  /(*(<, P),P)  +  ^  9iTiUp),P)  n) 

~p  ‘  \y(<»p)  =  p)»p).  *(0,p)  =  *o(p) ' 

Let  M  and  fl  be  bounded,  connected,  and  open  sets  in  /T*  and  Z?9,  respectively, 
such  that  x  £  M  and  pE  H,  where  p  represents  the  constant  parameter  vector.  We 
assume  that  the  vector  fields  f(-,p)  and  g{-,p),  and  the  function  h(-,p)  :  M  —  R™ 
are  real  analytic  on  M  for  all  p  G  fl.  The  problem  considered  here  is  identifiability 
of  (1)  in  the  experiments  specified  by  (xo (p),  U[0,  <1  ])  ,  where  x0(p)  denotes  the 
(possibly  parameterized)  initial  state,  and  tr [0,  < i  ]  is  the  set  of  bounded  and  mea¬ 
surable  controls  defined  on  [0,  <i].  Let  denote  the  input-output  map  of  (1). 

Parameter  values  p  ,  p  6  0  are  said  to  be  indistinguishable  (  denoted  by  p  ~  p  )  in 
the  experiments  (x0 (p),  U[0,  ])  if  Sp°^(tt)  =  X^°^(u)  for  all  u  £  U[0,  <j].  System 

(1)  is  globally  identifiable  at  p  if  p  ~  p,  p  G  Q  ,  implies  p  =  p.  System  (1)  is  locally 
identifiable  at  p  if  there  exists  an  open  neighborhood  W  of  p  in  fl  such  that  p  =  p, 
p  €  W"  ,  implies  p  =  p. 

A  summary  of  results  on  local  identifiability  of  (1)  is  given  in  [1].  These  results 
are  based  on  three  factors:  (i)  the  relationship  between  local  identifiability  and  lo¬ 
cal  observability  of  the  system  augmented  with  the  parameters  as  additional  state 
variables;  (ii)  the  functional  expansion  of  the  input-output  map  of  (1),  and  (iii) 
the  local  state  isomorphism  theorem  of  nonlinear  realization  theory.  While  the  first 
approach  is  inherently  local,  functional  expansions  (e.g.,  Taylor  and  generating  se¬ 
ries)  enable  one  to  study  also  global  identifiability  by  formulating  a  set  of  algebraic 
equations  for  the  parameters[2,3j.  Except  linear[2],  bilinear[4],  and  homogeneous 
polynomial(5]  systems  there  exists,  however,  no  a  priori  upper  bound  on  the  number 
of  series  coefficients  to  be  considered,  and  hence  lie-  resulting  conditions  are  suffi¬ 
cient  but  not  neccessary  from  a  practical  point  of  view.  The  structure  of  nonlinear 


equations  is  far  from  simple  (see,  e.g.,[6]),  their  number  is  large  even  for  bilinear 
and  polynomial  systems,  and  hence  global  identitiability  properties  are  difficult  to 
establish  in  most  applications. 

The  goal  of  this  note  is  to  extend  the  state  isomorphism  approach  to  the  anal¬ 
ysis  of  global  identifiability  in  nonlinear  systems.  In  addition  to  its  analiticity  we 
assume  that  system  (1)  is  locally  reduced  at  ro(p)  for  all  p  €  0  ,  i.e.,  it  satisfies 
both  the  controllability  rank  condition  (C.R.C)  and  the  observability  rank  condition 
(O.R.C)!7j. 

The  problem  of  global  identifiability  is  stated  as  follows. 

Problem  statement:  Given  (1)  and  p  £  fl,  find  all  p  €  fl  and  systems  of  the  form 

v*ntf>  •  J  *(<>P)  =  /(i(«.p),p)  +  u  g(x(t,p),p)  , 

l  y(t,p)  =  h(£(t,p),p),  x(0 ,p)  =  x0(p)  =  x0(p) 

such  that 

E;°(p)(u)  =  E;o(p)(u)  (3) 

for  all  u  €  f'r[0,<j]. 

It  follows  that  we  deal  with  a  highly  restricted  problem  of  system  equivalence. 
First,  both  (1)  and  (2)  are  locally  reduced  ,  and  have  the  same  subset  M  in  Rn  as 
their  state  spaces.  Second,  in  addition  to  the  input-output  map,  the  known  system 
structure  is  also  invariant  under  the  feasible  class  of  local  state  isomorphisms.  The 
analysis  is  based  on  the  construction  of  all  such  transformations.  This  ’dea  has 
been  applied  to  linear  systems[2,8],  where  equivalence  transformations  are  linear. 
Though  local  state  isomorphisms  between  (1)  and  (2)  generally  are  solutions  of  a 
set  of  partial  differential  equations,  their  construction  is  relatively  simple  for  locally 
identifiable  systems.  We  will  also  show  that  any  local  state  isomorphism,  preserving 
the  structure  of  a  homogeneous  system,  is  linear  Thus  the  method  is  very  simple 
for  this  class  of  systems,  and  the  known  conditions  for  global  identifiability  of  linear 


and  bilinear  systems  are  special  cases  of  the  present  results.  The  single-input  case 
is  considered  for  notational  simplicity  and  the  conditions  can  be  readily  extended. 


II.  IDENTIFI ABILITY  AND  CONSTRAINED  EQUIVALENCE 
The  following  condition  for  identifiability  is  the  immediate  consequence  of  the 
local  state  isomorphism  theorem  ([l],[7],[9i,[10])  and  the  constraint  (2)  on  the  form 
of  the  representations  of  (1). 

Proposition  1:  Consider  p,  p  6  fl,  an  open  neighborhood  V  of  x0(p)  in  /T1,  and 
any  analytic  map  X  :  \  '  PC'  defined  on  V'  such  that 


(i)  A(x0(p))  =  *0(p) 

(-D 

d\ 

(ii)  rank  —  =  n  at  all  i  G  V 

ox 

(5) 

(iii)  /(A(x),p)  =  |^/(x,p) 

(6a) 

3(A(x),p)  =  Qzg{i,p) 

(66) 

h{X(i),p)  =  h(x,p) 

(6c) 

for  all  x  f  V.  Then  there  exists  ti  >  0  such  that  (1)  is  globally  identifiable  at  p  in 
the  experiments  (x0(p),  C[0,  tj  ])  iff  conditions  (4)-(6)  imply  p  =  p. 

Proof:  (Necessity.)  Assume  that  p  ^  p,  V ,  and  X  satisfy  (4)-(6).  Introducing 
x  =  A_1(r)  into  (1)  gives 

v*0(p)  .  /  x  =  (dX/dx)  '  f(X(x),p)  +  u  (dX/dx)  1  g(X{x),p)  /yx 

~ V  \  y  =  h{X{i),p),  x0(p)  =  A_,(x0(p)). 

Select  <i  >  0  such  that  x(t,p)  £  V  for  all  u  £  f  ’ [ 0 .  #  j  ] .  where  x(t,p)  is  the  so¬ 
lution  of  the  differential  equation  in  (7)  with  the  initial  state  xo(p).  By  (ii)  X  is 
a  local  state  isomorphism  defined  on  V  and  hence  E*n<p)(u)  =  Ep°^p>(u)  for  all 
u  6  Cr[0,<i].  By  (i)  io(p)  =  A-1  (A(x0(7> '  I  -  r  i.nn.l  by  (iii)  (7)  is  represented 
by  (2). Therefore , Sp°^  (u)  =  S^0^(u)  for  all  v  £  f’[0,  tj]  ,  and  p  --  p  follows. 


<0. 
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(Sufficiency.)  If  p  yt  p,  p  p,  then  (2)  is  a  local  representation  of  (1)  on  some 
neighborhood  Vj  of  x0  (p),  and  it  is  locally  reduced.  By  the  local  state  isomorphism 
theorem  for  any  such  representation  S  of  (1)  there  exists  an  open  neighborhood  V'2 
of  io(p )  and  a  unique  analytic  diffeomorphism  A  defined  on  V2  such  that  S  is  of  the 
form  (7).  Therefore,  p  and  A  satisfy  the  conditions  (4)-(6)  on  V7  =  lj  Pi  I2.  0 

Remark  1:  Let  (1)  be  globally  identifiable  at  p.  It  follows  from  the  uniqueness 
of  the  local  diffeomorphism  A  (see,e.g.,[10])  that  the  only  pair  (p,A)  that  satisfies 
the  conditions  of  Proposition  1  is  (p,  idn),  where  idn  :  V’  — ♦  f?71  denotes  the  identity 
mapping.  Conversely,  A  ^  idn  implies  p  ^  p. 

Remark  2:  If  certain  initial  states  are  completely  known,  write  x0  (p)  =  ([ar^1  ^  ^ , 
ko2)(p)]T)T.  where  x^1'  represents  the  parameter-independent  components  of  x0. 
Then  (4)  defines  the  constraints 


on  A.  In  the  limiting  (i.e., parameter-independent)  case  x0  =  xj,1^,  and  (8)  is  reduced 
to 

A(x0)  =  xo-  (9) 


Example  1:  Consider  the  system 

=  P\*]  +  P2*iZ2  +  u  x3  (0, p)  =  i2(0,p)  =  0 
x2  =  pzx\  +  PiXxx-i  y  -  x j. 


(10) 


With  ft  =  {p  €  R*  ,Pi  7^  0,  |p,|  <  K  r  i?1 ,  K  >  0}  (10)  is  locally  reduced  at  x0  for 
all  p  £  ft.  We  construct  all  local  transformations  A  =  (Aj  .  A2)  that  satisfy  conditions 
(4)-(6).  Since  /i(x,p)  =  [1  0]x,  by  (6c) 


Ai  <t,  .3-2  )  =  •»'•, 


(11) 
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whereas  (6b)  implies  dX2/di}  =  0.  Then  (6a)  is  reduced  to 


w 


f  Pli\  +  p2i1X2\  _  (  ^  0  \  ( Pi*]  +  P2*I*2  \  (12) 

\pzi]  +  pAi^X2  J  \ 0  dX2/di2J  \psi]  +  p<i]*2  )  ' 

to  be  satisfied  on  an  open  neighborhood  of  the  origin  in  i?2 .  By  the  first  equation 
of  (12),  p!  =  pi,  and 

A2(ii  ,sc2 )  =  ^ x2 .  (13) 

Pi 

Thus  dX2/dx2  =  ih/p2-  From  the  second  equation  pz  =  P2P3/P2  and  p2p4/p2  — 
P2P4  jp2  •  Since  P2  is  arbitrary  and  pj  ^  0  by  condition  (ii),  we  have  p4  =  p4 .  The 
initial  states  are  known,  but  (9)  does  not  further  restrict  the  one-parameter  family 
of  feasible  transformations  given  by  (11)  and  (13),  where  P2  ^  0  is  arbitrary.  Thus 
(10)  is  nowhere  locally  identifiable  on  fi.  This  result  can  be  obtained  also  by  the 
methods  presented  in  [1].  In  addition  we  show,  however,  that  parameters  pi  and 
p4  are  unique,  independently  of  the  value  of  p2,  and  at  fixed  P2  =  p2  the  system 
becomes  globally  identifiable  at  all  p  €  ft.  As  shown  in  [11],  it  is  much  more  tedious 
to  establish  these  properties  by  the  generating  series  expansion  approach. 

Remark  3:  This  example  illustrates  two  important,  though  not  completely 
general  properties  of  the  present  method.  First,  note  that  the  general  solution 
X  of  the  set  (6a)-(6b)  of  first-order  linear  partial  differential  equation  depends  on 
arbitrary  functions  (see,e.g.,[12]).  If  (1)  is  locally  or  globally  identifiable, then  at 
most  a  finite  number  of  these  solutions  satisfies  the  additional  constraints  (6c) 
and  (8).  Therefore, restricting  consideration  to  diffeomorphisms  satisfying  (6c)  and 
(8)  one  can  expect  that  there  will  be  no  need  for  actually  solving  the  differential 
equations.  As  Example  1  shows,  this  may  be  the  case  even  for  locally  unidentifiable 
systems.  Second,  the  use  of  the  local  state  isomorphism  theorem  restricts  the  scope 
of  Proposition  1  to  some  interval  [0,fil.  Howrv-v.  if  \  satisfies  (5)  and  (6)  for  all 
x  €  M,  then  the  transformation  is  global  and  / j  „•  0  is  arbitrary. 
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Remark  For  a  time-invariant,  structurally  controllable  and  structurally  ob¬ 
servable  parameterized  linear  system  represented  by  (A(p),  B(p),C(p))  we  have 
A(i )  =  Tx,  where  T  :  IT'  — *  ft™  is  a  nonsingular  linear  transformation[12j.  Then  the 
further  conditions  of  Proposition  1  are  reduced  to  T(8)xq(p)  =  xo (p),  A(p)T(0)  = 
T(6)A(p),  B(p)  =  T(8)B(p ),  and  C(p)T(0)  =  C(p),  where  8  denotes  the  entries  of 
T  and  emphasizes  that  8  is  to  be  determined  in  addition  to  p  in  order  to  satisfy 
the  equations. The  linear  system  is  globally  identifiable  at  p  iff  the  only  solution  is 
p  =  p,  and  then  T(0)  =  I  follows  as  mentioned  in  Remark  1.  This  agrees  with  the 
results  of  Walter[2,3,8j.  Since  state  diffeomorphisms  are  linear,  a  similar  identifi- 
ability  condition  can  be  formulated  for  bilinear  systems  with  a  linear  observation 
function[14]. 


III.  HOMOGENEOUS  SYSTEMS 

We  now  show  that  there  exists  a  more  general  class  of  systems  such  that  consid¬ 
erations  can  be  restricted  to  linear  state  transformations  when  solving  the  problem 
stated  in  Section  I.  The  result  is  based  on  the  following  lemma. 

Lemma  1:  Assume  that  /(-,p)  and  g{-,p )  are  defined  by  homogeneous  coordi¬ 
nate  functions,  i.e., there  exist  integers  k  and  £  such  that 

kf{x,p)  =  {df{x,p)/dx)x  ,  £g{x,p)  =  (dg{x,p)/dx)x  (14) 

at  ail  x  £  M;  and  the  observation  function  is  linear,  h(x,p )  =  C(p)x.  If  p  ~  p  in 
the  experiments  (0,(7[0,ti])  for  some  <j  >  0,  then  there  exists  a  nonsingular  linear 
transformation  T  :  FT1  — +  fT1  such  that  x(t,p)  =  Tx(i,p)  for  all  0  <  t  <  t\ ,  where 
x(t,p)  and  i{t,p)  denote  the  solutions  of  the  differential  equations  in  (1)  and  (2), 
respectively. 


e 


i 

i 
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Proof:  Introduce  the  notation  ip(x,u)  —  f{x,p)  +  ug(x,p)  and  <f(x,u)  = 
f(x,p)  +  ug(x,p).  By  Proposition  1  there  exists  an  open  neighborhood  1',  of  i0  =  0 
such  that  x  =  A(x)  on  V) .  Thus  we  can  write  (1)  and  (2)  as 

A  =  p(A,u),  A0  =  A(i0)  =  0  (15) 


and 

x  -  ip(x,u)  ,  xo  =  x0  —  0  (16) 

respectively.By  the  local  weak  controllability  of  (16)  at  £o  =  0,  there  exists  an  open 
neighborhood  V2  of  i0  such  that  any  x  £  V2  is  reachable  from  x0.  Therefore, the 
equality 

V  =  C{p)X{x)  =  C{p)x  (17) 

holds  for  all  x  £  V  =  Vj  D  l2.  Let  Cj  and  c}  denote  the  jth  rows  of  C(p)  and  C(p), 
respectively.By  (17)  for  any  x  £  V,  for  any  i  >  0,  any  constant  controls  u1 ,  ,u’ , 
and  sufficiently  small  si , . . .  ,st  >  0  we  have 


Cjbi,  0  •••  °%22  °7],  (*(*)))  =  Cj(5I,  o  .. 


o  7 2 


*2 


(18) 


for  j  —  1, . . .  ,m.  Here  7J  and  7J  denote  the  flows  of  ip*(A)  =  ip(A,u‘)  and  ip'(x)  = 
<p(x,ul),  respectively.  Differentiating  with  respect  to  ,s  j,  at  0  yields 


;  A)) . . -)>(*)  =  L^{...{L^{cjx))...)i  (19) 

where  L^\  denotes  Lie  differentiation  along  the  vector  field  [7].  Differentiating 
(19)  with  respect  to  x  and  multiplying  by  x  gives 

(d(Lvi(...(Lv,(c2  A))...)A(£),~ i)  =  (d{L^i(. .  .(L^icji)) . .  .)i,x)  (20) 

where  the  differential  d{L^\  (. . .  (£^.  (cjX  ) .  .  .  I*  is  repr-sented  by  a  row  m-vector 
valued  function, and  the  vector  field  with  th<  diunt*-  functions  (?],...,£„)  is 
also  denoted  by  x.  Assume  first  that  k  =  L  By  (14)  we  have 
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(21) 


Lpicji)  =  erf*  =  =  ^-{d(L^(cjx)),x). 


Then 


(Lf.-l(Lf.(cJr)))=  l(rf(^,-1(4(cJi))),i}+,(^.(cJi)),(r,,ii)  (22) 


where  [•,  ■]  is  the  Lie  bracket  of  two  vector  fields  (see,e.g.,[9]).  By  (14)  \<p'  1 ,  sc]  =  (1- 
k) i^,_1  ,  and  hence  the  second  term  on  the  rhs  of  (22)  is  (1  —  k)/k(L$>- 1  ( L ( Cji ))). 


Therefore, rearranging  (22)  yields 


(2Jt  -  l)(I*.-x(I*.(c,-i)))  =  (d(I^.-x(I#.(c, •*)))*, i) 


Since  <^>  is  a  (c-order  homogeneous  function  of  A,  analogously  rearranging  the  lhs  of 
(19)  and  continuing  for  i^’-2 , . . .  yields 


(d(Lvi  (. . .  (Lp,  (cj  A)) . .  .)>(£>,  A(x))  =  (<f(L*t(...(I*.(c;x)).  ..)*,*)•  (24) 


From  (20)  and  (24) 


(d(L^,  (. . .  (Tv.  (c;  A)) . .  — x)  -  ••  {L^{cj  A)) ..  •)*>*)•  (2^) 


Since  the  O.R.C.  is  satisfied  at  0,  by  analiticity  of  (1)  there  exists  an  open  neigh¬ 
borhood  of  0,  also  denoted  by  V,  such  that  the  vectors  d(L^\(. . .  (L^,(c}  A)  . .  .)*(#) 
span  an  n-dimensional  space  at  all  x  E  V  [7j.  Thus  equations  of  the  form  (25)  imply 


W-N 

—  (x)x  =  A(x) 

ax 


for  all  x  €  V.  By  (26),  dXjdx  is  a  zero-order  homogeneouos  function,  i.e., 


dX.  ~ t  dXt 


for  all  x  €  1  and  all  a  6  i?1  satisfying  aa"  -  I  .  and  by  0-1  it  is  defined  at 
x  =  0.  Setting  a  =  0  in  (27)  shows  that  dX/dx(0)  =  dX/dx(x)  at  all  x  <E  1  ,  and 


A(x)  =  Tx.  By  (14)  A (x)  =  Tx  satisfies  (5)  and  (6)  for  all  x  €  A/,  and  hence 
x(t,p)  —  Tx(t,p)  for  all  0  <  <  <  . 

For  l  <  m  introduce  the  additional  state  variable  xn4i,  an  (m  -  £)-order 
homogeneouos  function  r  :  Ip  — *  R1  ,  where  Ip  is  an  open  subset  in  R1  such  that 
r(s)  ^  0  for  all  s  €  Ip,  and  input  u  =  u/r(xn+i  ).Let  x*  =  (xT,xn+i)T  ,  x*  = 
(xT,xn+1)T,  and  consider  the  augmented  vector  field  :  M  x  Ip  — *  i?n+1  and 


matrix  C*  defined  by 


y?*(x*,u)  =  ^ 


f{x,p)  +  ur{xn+}  )g(x,p) 


),  <”=(?  !)■  (28) 


Augmenting  (1)  and  (2)  we  have 


x*  =  <p*(x*,ti),  y  =  Cmx\  xl  =  (xjf,/?)T, 


f*  =  ^(x*,i),  y  =  C*x‘,  i;  =  (i0T,/?)T,  (30) 

where  <p*  rind  C*  are  defined  by  (28)  at  the  parameter  value  p.  Let  £po(p)  and 
p  denote  the  input-output  maps  of  (29)  and  (30),  respectively. Since 

s^(p)(«)=  (E%(u))’  TO 

p  ~  p  implies  Sp°  ^(u)  =  S^°^(u)  for  all  u  6  f^[0,tj]  and  for  all  0  6 
Since  xn+i  =  x„+i  =  0,  in  spite  of  uncontrollability  of  (29)  and  (30),  the  only 
isomorphism  between  x*  and  x*  is  given  by  A*  :  x*  — ♦  (Ar(x),  xn+i ),  and  then 
y  =  C*A*(x*)  =  C*x*  for  all  x*  €  V  x  Ip.  The  augmented  systems  are  homoge¬ 
neouos, and  the  previous  part  of  the  proof  applies  and  yields 


(d\  (x)fdx  0\  /  X  \  /A(i)\ 

V  o  1  /  \  x„  +  1  /  \  i'n~  1  J 


This  is  valid  for  all  x  £  V,  and  thus  (26  i  and  ( 2  ■  '•  f<  >1 ! ■  *\v .  Fnr  m  •  (  the  pr< *<  >f  is 

analogous  with  u  =  ur(xn+] ).  ^ 
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Replacing  A(x)  by  Tx  in  (4)-(6)  gives  a  simple  condition  for  identifiability  of 
homogenous  systems  in  the  experiments  (0,  F[0,  <1  ]),  tj  >  0  arbitrary.  This  result 
can  be  slightly  extended  by  considering  a  set  /j  C  fT11,  nj  <  n  of  feasible  initial 
states  ip1'.  Define  Ip  =  {xf^p^XQ1*  £  h  }■  System  (1)  is  globally  identifiable  at 
p  6  fi  in  the  experiments  (/p<fr[0,<i|)  if  and  only  if  there  exists  io(p)  £  /p  such  that 
(1)  is  globally  identifiable  in  the  experiments  (xo(p),  Cr [0,  t\ ]).  FYom  Proposition  1 
and  Lemma  1  we  have  the  following  result. 

Proposition  t:  Assume  that  system  (1)  satisfies  the  assumptions  of  Lemma  1, 
it  is  locally  reduced  at  i0  (p)  for  all  p  6  D  ,and  for  all  x^  E  /i ,  and  0  G  Ip.  Consider 
p,p  £  11,  and  any  linear  transformation  T(0)  :  IT1  — *  iT*  such  that  (i)  T(0)xo(p)  — 
x0(p)  for  all  x[,3)  6  I\  ,  (ii)  T(6 )  is  nonsingular,  and  (iii)  f(T(0)x,p)  =  T(&)f(x,p), 
g(T($)x  ,p)  =  T{6)g(x  ,p),  and  C(p)T(0)  =  C(p)  for  all  x  £  M.  Then  (1)  is  globally 
identifiable  at  p  in  the  experiments  (/p,  f  '[0,  <j])  for  arbitrary  tj  >  0  iff  conditions 
(i),(ii),and  (iii)  imply  p  =  p  . 

Proof:  As  shown  in  the  proof  of  Lemma  1  for  xD  =  0,  x  =  Tx  is  a  global 
transformation  defined  at  all  x  6  A/  if  (1)  is  homogeneous.  Assume  that  T  satisfies 
the  constraint  (i).  By  analiticity  of  local  diffeomorphisms,  A(x)  =  Tx  for  any  xo(p) 
and  for  any  local  transformation  A  defined  on  some  open  neighborhood  of  x0(p), 
and  the  proposition  follows.  ^ 

The  identifiability  conditions  for  linear  and  bilinear  systems,  discussed  in  Re¬ 
mark  4  are  particular  cases  of  Corollary  1  with  m  =  1,  l  —  0,  and  m  =  1,  l  =  1, 
respectively.A  further  application, particularly  important  in  ecology  and  chemical 
reaction  kinetics,  is  to  the  system 

x,(f,p)  =  xT{t,p)Ail){p)x{l,p)  +  Mp)u  «  t  =  1,. . .  ,n, 

(33) 


y{t,p )  =  C(p)x(t,p) ,  xio.pl  =  x„(p). 


where  A*1*,  f  -  1,...  ,n,  are  n  x  n  symmetric  niatnces.  Denote  a*  * '  =  and 

9X}  =  [T(0)]t;  ,  i,j,fc  =  1,.. .  ,n,  then  we  have  the  following  result. 
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Corollary  1:  Consider  p,  p  €  ft  and  any  nonsingular  linear  transformation 
T{6)  :  if"  — ♦  iZ"  such  that  7\0)xo(p)  =  *o(p)  for  ^  fi» 


6(p)  =  r(d)6(p)  ,  C(p)7\0)  =  C(p)  (34) 

n  n  n 

^2  12  a”)(p)9r>6”  =  12  a»;r)(p)^>  =  f»  •  ■  •  >n-  (35) 

r=l  J=1  r=l 

System  (33)  is  globally  identifiable  at  p  in  the  experiments  (/p,f/[0,<i]),  where 
0  £  Ip,  and  t]  >  0  arbitrary,  iff  the  above  conditions  imply  p  =  p. 

Remark  5:  As  mentioned  in  Remark  1, global  identifiability  also  implies  T(9)  — 

I. 


Example  2:  Consider  the  system 


x3  =  pjXjx2  +  u  ,  xi(0,p)  =  x2(0,p)  =  0  , 

x2  =  p2  xj  x2  +  p3  x2  +  u  y  =  x2  , 


which  is  of  the  form  (33)  with 


(36) 


'"’w  =  (Pl°/2  P,o/2 )  ■  '•"’w  -  (*%  "»)  -  *(rt  -  (1 )  •  <’■> 

and  C  =  [0  1].  It  is  easy  to  show  that  (36)  is  locally  reduced  at  x0  =  0  for 
all  p  £  ft  =  {p  £  R3',Pi  ±  0,p}  -  p2  -  pa  ±  0,  (p,|  <  K  £  R',K  >0}.  By- 
Proposition  2  we  consider  the  nonsingular  linear  transformations  T(6)  :  R2  — *  R2 . 
Since  6(p)  =  6(p)  and  C(p )  =  C(p),  conditions  (34)  restrict  T(6)  to  the  form 


n>)  =  ( 1  09” 


9r) 


(38) 


and  T(6)  is  nonsingular  for  any  On  ^  1-  Then  the  nontrivial  equations  in  (34)  are 

Pi  —  ®12(Pl  —  P2  )  =  Pi  —  ^12 Pi  i  ^12?3  =  $12  Pi  i  P2  =  ?2  ~  $12P2  1  and  P3  =  $12?2  +  P3  • 
At  612  =  0,  p  =  p  and  T(6)  ~  I.  Except  pi  —  pi  th»*r*>  exists,  however, a  second 
solution  012  =  (Pi  -  P3 )/ P2 1  which  yields  p,  =  p3  .  /»2  -  p2  p3  -  Pi  ,  and  p,  =  p|. 
Therefore, the  system  is  locally  identifiable  at  all  p  £  ft,  but  it  is  globally  identifiable 


12 


e 


only  on  the  subset  {p  £  Q;  p\  =  pj }  of  zero  measure  in  ft.  We  note  that  by  the 
lack  of  applicable  necessary  conditions  for  global  identifiability,  (36)  is  the  first  such 
nonlinear  system  presented  in  the  literature  (see,e.g.,(l],[2],[3],[6],[ll],[15j). 

By  Corollary  1  we  can  study  also  identifiability  of  (36)  with  nonzero  initial 
conditions. In  the  most  general  case  Xi(0,p)  =  p4  and  2:2 (0,p)  =  ps  are  additional 
parameters.  By  condition  (i)  of  Proposition  2  the  additional  constraints  on  (38)  are 
(1  _ ff12)p4  +0i2Ps  =  Pi  and  p5  =  ps-  It  follows  that  ps  is  unique  and  there  exist  two 
solutions  for  p4.  The  system  becomes,  however,  globally  identifiable  at  all  p  €  H 
if  p4  =  To,i  is  known  and  there  exists  a  point  *o,i  i1  0  in  1\  (i.e.,  the  constraint 
p4  =  p4  implies  012  =  0  ). 
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Abstract 

By  following  the  kinetics  of  a  reaction  through  the  use  of  certain  classes  of 
measurable  quantities  instead  of  the  concentrations  of  all  species  neither 
the  parameter  values  nor  the  reaction  scheme  are  necessarily  unique. 

Identif iability  deals  with  the  problem  of  determining  whether  an  experiment 
is  able  to  supply  the  desired  information  on  the  parameters  of  an  assumed 
kinetic  model,  whereas  indistinguishability  means  that  two  different  reaction 
schemes  generate  the  same  values  for  the  observed  quantities  in  any  possible 
experiment.  This  paper  examines  these  issues  for  the  case  of  first-order 
reaction  systems  and  both  problems  are  solved  by  the  same  analytical  tools. 
The  method  involving  Laplace  transforms  is  conceptually  simple,  easy  to 
apply,  and  is  also  used  to  derive  simple  rules  to  test  distinguishabilitv  of 
reaction  schemes.  Another  approach  based  on  similarity  transformations  is 
used  to  generate  all  the  first-order  reaction  schemes  that  are 
indistinguishable  from  a  given  one. 
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I . Introduction 

Kinetic  experiments  are  often  conducted  under  conditions  such  that  the 
reactions  are  first-order  or  pseudo  first-order,  with  rate  coefficients 
proportional  to  the  concentration  of  a  reaction  partner  in  large  excess. 
Interpretation  of  experimental  data  by  postulating  a  mechanism  and  adjusting 
the  values  of  some  unknown  parameters  has  received  due  attention  in  the 
literature2*6.  The  problems  usually  considered  are  techniques  of  parameter 
estimation  and  statistical  interpretation  of  the  estimates  in  terms  of 
confidence  intervals  or  joint  confidence  regions.  Kinetists  are  aware  that 
there  remain  further  fundamental  questions  to  ask.6  First,  are  the  derived 
parameters  unique,  or  are  there  further  parameter  sets  generating  the  same 
values  for  the  observed  quantities?  Second,  is  the  selected  model  the  only 
plausible  one  which  will  give  an  acceptable  fit  to  the  data?  These  questions 
of  parameter  and  model  uniqueness  are  not  trivial  even  for  very  simple 
mechanisms  if  not  all  concentrations  are  directly  observed. 

For  example,  consider  the  consecutive  reaction  scheme 

A  -  B  -  C  (1.1) 

studied  in  several  works3*5  assuming  that  initially  only  A  is  present  in  the 
system  and  the  reaction  is  followed  by  observing  the  single  property 

y  -  «a[A)  +  «|[B]  +  ec[C)  (1.2) 

which  may  represent  absorbance,  conductivity,  pH,  or  ligand  release.  We 
regard  y  as  absorbance  and  ,  t6  and  ec  as  molar  extinction  coefficients. 
Frequently  the  intermediate  species  B  cannot  be  isolated  and  separately 
investigated,  hence  eB  is  an  additional  parameter  to  be  estimated 
simultaneously  with  the  rate  coefficients  kj  and  k2  from  the  time-absorbance 
data.  As  is  well  known,3*5  under  these  conditions  the  solution  of  the 
estimation  problem  is  not  unique  because  of  the  slow-fast  ambiguity,  thus  for 


e 
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any  solution  k-(k1 ,  k2 ,  «B)T  there  exist  a  second  solution  k—  C k1  ,  k2  ,  t B ) 
given  in  terms  of  k  by 

k,  -  k2  ;  k2  -  k,  ;  TB  “  +  k,  («B -tA  )/k2  .  (1.3) 


In  addition  to  nonuniqueness  in  parameter  values  there  may  be  ambiguities  in 
the  model  structure.  As  emphasized  by  Milligan  et  al.,5  a  qood  fit  does  not 
necessarily  mean  that  the  model  is  correct,  since  there  exist  further 
reaction  schemes  generating  the  same  absorbance  curve.  They  mention  the 
schemes 


(1.4) 


whereas  Jackson  et  al.4  claim  that  the  absorbance  data  can  equally  be 
described  by  adopting  the  reaction  schemes 


The  purpose  of  this  paper  is  to  present  a  systematic  and  rather  general 
analysis  of  the  problems  of  parameter  uniqueness,  called  identif iability ,  and 
distinguishability  of  different  first-order  reaction  schemes.  While 
identif iability  has  received  a  fair  amount  of  attention  in  application  areas 
such  as  automatic  control,7  compartmental  modeling,8  and  chemical 
engineering,9  in  chemical  kinetics  results  have  been  restricted  to 
discovering  nonuniqueness  of  the  parameters  in  particular  reaction  systems 
through  the  use  of  methods  of  limited  applicability.  Similarly,  the  general 
results  available  on  distinguishability  are  rarely  applied  to  kinetic 
models, (9c )  though  without  systematic  analysis  mistakes  can  be  made.  We  will 
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show  that  the  schemes  S2 ,  S3 ,  and  S5  in  (1.4)  and  (1.5)  are,  in  fact, 
distingishable  from  the  one  in  (1.1),  whereas  there  exist  indistingushable 
schemes  overlooked  in  previous  studies  of  this  simple  system  (for  the 
illustrative  purposes  of  this  paper  we  shall  assume  a  measurement  of  the  form 
in  (1.2)  in  most  examples). 

Identif iability  and  distinguishability  are  so  closely  related  that  all 
the  required  machinery  is  introduced  by  discussing  the  first  and  somewhat 
simpler  problem.  Identif iability  concepts  are  also  needed  to  properly 
understand  some  distinguishability  results.  For  example,  we  show  that  (1.1) 
and  the  scheme  S,  in  (1.4)  are  indistinguishable,  but  their  identifiability 
properties  are  substantially  different,  since  using  the  latter  model  the 
desired  absorbance  curve  can  be  generated  at  infinitely  many  different 
parameter  values. 

We  will  regard  two  reaction  schemes  as  indistinguishable  if  and  only  if 
they  generate  exactly  the  same  values  for  the  observed  quantities  (e.g.,  for 
absorbance  in  the  case  discussed  here)  and  hence  employ  a  deterministic 
framework  by  restricting  considerations  to  idealized  experiments  with  the 
ability  of  observing  the  measurable  variables  at  any  instant  of  time 
error-free.  Deterministic  identifiability  is  a  fundamental  property  of  a 
kinetic  model,  since  unidentif iability  in  this  idealized  experiment  implies 
unidentif iability  in  any  realistic  experiment  with  constraints  on  sampling 
and  measurement  accuracy.  Similarly,  models  indistinguishable  in  the 
deterministic  sense  remain  indistinguishable  in  any  experiment  involving  the 
same  measurable  quantities.  It  should  be  emphasized  that  the  deterministic 
analysis  is  only  the  first  step  in  establishing  uniqueness  of  parameter 
estimates  or  uniqueness  of  a  kinetic  model.  In  fact,  inadequate  design  of 
the  identification  experiment  and/or  large  measurement  errors  may  result  in 


highly  uncertain  estimates  even  for  the  parameters  of  an  identifiable  model. 
Similarly,  a  set  of  noisy  observations  may  be  compatible  with  the  responses 
of  several  models  in  spite  of  their  deterministic  distinguishability .  Since 
the  analysis  of  these  problems  requires  assumptions  on  the  experiment  design, 
on  the  structure  of  measurement  errors,  and  on  the  values  of  the  parameters, 
it  can  usually  be  performed  only  a  posteriori  after  carrying  out  the 
experiment  and  estimating  the  parameters.  The  deterministic  analysis  is, 
however,  an  a  priori  procedure  for  detecting  a  fundamental  class  of 
ambiguities,  thereby  assisting  the  selection  of  possible  models  and  the 
variables  to  be  observed  in  the  intended  experiments. 

The  paper  is  organized  as  follows.  In  Section  II  we  introduce  the 
concepts  of  deterministic  identif iability  and  offer  two  general  methods  of 
analysis  based  on  Laplace  and  similarity  transformations,  respectively.  The 
Laplace  transformation  approach  is  also  used  to  study  distinguishability  in 
Section  III  and  enables  us  to  formulate  a  number  of  propositions,  thereby 
considerably  facilitating  the  required  algebraic  manipulations.  The 
similarity  transformation  approach  is  fully  exploited  in  Section  IV  offering 
a  procedure  to  generate  all  first-order  reaction  schemes  that  are 
indistingishable  from  a  given  one.  In  particular,  results  are  peresented  for 
the  scheme  in  (1.1).  The  methods  can  be  most  easily  understood  by  solving 
simple  problems  and  we  present  a  number  of  examples  for  this  purpose. 

II.  Identif iability 

A  first-order  reaction  scheme  under  isothermal  condition  gives  rise  to 
the  kinetic  equations  of  the  general  form 


x ( t , k)  -  A(k)x(t,k) 


x(0 ,k)  -  Xq (k)  - 


x0r2)(k) 


(2.1) 


where  x(t,k)  is  the  n-vector  of  concentrations,  depending  on  the  p-vector  kffi 
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of  unknown  parameters,  OcRp  representing  the  set  of  possible  parameter  values. 
We  assume  that  0  is  a  bounded  open  set  in  Rp ,  thus  the  parameters  are  a 
priori  independent  and  restricted  only  by  inequality  constraints  (e.g.,  by 
nonnegativity  of  the  rate  constants).  In  the  initial  concentration  vector  xc 
we  distinguish  between  the  components  in  xp ) ,  selected  to  specify  an 
experiment,  and  those  in  x^2 ) ,  depending  on  unknown  parameters  (e.g.,  initial 
conditions  in  x^2 )  can  be  parameters  themselves).  In  addition  to  the  kinetic 
equations  (2.1)  our  model  consists  of  the  linear  observation  function 

y(t.k)  -  C(k)x(t,k)  (2.2) 

where  y(t,k)  is  the  m-vector  of  observable  quantities,  also  called  the 
response  function  of  the  model.  As  is  seen,  the  observation  matrix  C(k)  may 
also  depend  on  unknown  parameters. 

Consider  a  kinetic  experiment  specified  by  the  initial  concentrations 
x^1 )  and  let  y(t)  represent  the  response  function  observed  over  some  time 
interval  T.  The  basic  assumption  of  deterministic  analysis  is  the  existence 
of  a  nominal  parameter  value  kefi  such  that  y(t,k)-y(t)  and  this  function  can 
be  observed  at  all  t«T  error-free.  Two  parameter  values  k  and  k*k  are 
indistinguishable  in  the  considered  experiment  if 

y(t,k)  -  y(t,k)  (2.3) 

for  any  teT.  The  analysis  of  identif iabil ity  is  based  on  eq.  (2.3)  and  the 
following  situations  can  be  encountered: 

(i)  if  the  solution  k-k  of  (2.3)  is  unique,  the  model  (2.1)-(2.2)  is  said 
to  be  uniquely  identifiable  at  k«H; 

(ii)  if  there  exist  at  most  a  finite  number  of  distinct  solutions  k*k ,  the 
model  is  said  to  be  identifiable  at  k; 

(iii)  finally,  with  an  infinite  number  of  solutions  in  (2.3)  the  model  is 
said  to  be  unidentifiable. 
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Since  the  nominal  parameter  value  k  is  not  known  a  priori,  the  above 
concepts  should  be  generalized.  It  would  be  easy  to  require  identif iability 
at  every  keO.  In  most  models,  however,  there  exist  exceptional  points  or 
lower  dimensional  surfaces  in  0  where  the  model  is  unidentifiable  in  spite  of 
its  identif iability  at  the  majority  of  points.  Properties  that  hold  at 
almost  every  point  of  the  parameter  set  are  usually  called  structural 
ones.10  Therefore,  the  model  is  said  to  be  structurally  identifiable 
(uniquely  structurally  identifiable)  if  it  is  identifiable  (uniquely 
identifiable)  at  almost  every  kcft,  thus  except  at  the  points  of  a  set  of 
measure  zero  in  CJ.  As  shown  in  our  examples,  the  existence  of  such 
exceptional  subsets  does  not  decrease  practical  utility  of  the  concepts. 

In  view  of  the  extensive  list  of  publications7 '9  on  the  identif iability 
problem  we  restrict  considerations  to  two  basic  methods  of  analysis  enabling 
one  to  test  first-order  reaction  systems  of  moderate  complexity  without 
programming  efforts.  Both  methods  will  also  be  needed  when  studying 
distinguishability  of  different  schemes. 

1.  Laplace  transformation  approach 

Taking  the  Laplace  transform  of  the  differential  equations  (2.1)  we 
obtain 

sX(s,k)  -  A(k)X(s,k)  +  x0(k)  (2.4) 

where  X(s,k)  is  the  transform  of  the  concentration  vector  x(t,k)  and  s 
denotes  the  complex  argument.11  Taking  also  the  transform  of  (2.2)  and  using 

(2.4)  gives  the  Laplace  transform 

Y(s ,k)  -  C(k) [ si -A(k) ] * 1  x0 (k)  (2.5) 

of  the  response  function  y(t,k).  Note  that  in  spite  of  the  general  formula 

(2.5)  the  linear  equations  (2.4)  can  be  solved  for  X(s,k)  by  the  elimination 
technique  and  no  matrix  inversion  is  necessary  to  obtain  Y(s,k). 


(2.6) 
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Since  (2.3)  is  satisfied  for  all  te  T  if  and  only  if 

Y(s,k)  -  Y(s,k) 

for  all  se  <p,  where  $  is  the  field  of  complex  numbers,  we  can  restrict 
considerations  to  (2.5)  without  solving  the  kinetic  equations.  Each 
component  Yj(s,k)  of  the  m-vector  Y(s,k)  is  a  rational  function  of  the  form 


Yi (s.k) 


&'  sn  '1  +  .  . .+  6s 
wn+1  ™2  n 


Sr'+4>-i 


i  c  n  - 1 


+•••+  *n 


(2.7) 


where  the  coefficients  <£]  generally  depend  on  k  and  x^1 ) .  After  simplifying 
the  possible  common  factors  between  the  numerator  and  the  denominator  poly¬ 
nomials  in  (2.7),  the  vector  <f>  of  "moment"  invariants  is  formed  by  all  dif¬ 
ferent  coefficients  in  Y1 (s ,k) , . . . ,Ym (s ,k) .  Since  (2.6)  holds  if  and  only  if 


*(k)  -  *(k). 


(2.8) 


the  analysis  of  identifiability  is  reduced  to  the  problem  of  determining  the 
number  of  solutions  in  the  set  (2.8)  of  polynomial  equations . 1 0 • 1 2  The 
following  examples  demonstrate  the  simplicity  of  the  method  and  the  presence 
of  exceptional  subsets  of  zero  measure,  taken  into  account  in  our  definitions 
Example  1.1.  Consider  the  reaction  scheme  St  in  (1.9)  with  the  response 
function  (1.2)  and  initial  concentrations  [B]0— [C]0  —0 .  The  Laplace  transform 
of  (1.2)  is  given  by 


Y(s.k)  - 


V2  +  [eA(k-1+V  +  («VS  *  {Ck1k2 

s3  +  (k^  +k_  t  +k2  )  sz  +  k^kjs 


[A], 


(2.9) 


Since  7A  -  fA  and  7C  -  ec  are  known,  the  independent  equations  of  the  form 
(2.8)  are 


«A(k.i+k2)  +  «Bk,  -  <A(k_.,+k2)  +  tgk. 


(2.10a) 


1  -V-  ■ 


f] 


k,  +  k.1  +  k2  -  k,  +  k.  1  +  k2 


(2.10b) 


ki  k2  -  k1  k2 


(2.10c) 


(a)  Assume  first  that  eB  -  «B»*eA  also  known.  Then  (2.10a)  and  (2.10b) 


give  k, -k,  and  k. 1 +k2-k. ! +k2 .  Using  now  (2.10c)  we  obtain  the  unique  solution 


k-k.  There  are,  however,  an  infinite  number  of  solutions  if  k^ -0  or  k2-0. 
These  exceptional  points  form  two  planes  in  R3 ,  and  thus  are  sets  of  zero 


measure  and  hence  the  model  in  uniquely  structurally  identifiable  if  eB  is 


known . 


(b)  Consider  the  more  general  case  with  parameters  k**^ .k., ,k2 , «B )T . 


Since  (2.10)  consists  only  of  three  equations  to  determine  four  parameters, 


the  model  is  unidentifiable. 


Example  2.2.  It  is  easy  to  show  that  the  scheme  in  (1.1)  is  structurally 


identifiable,  but  not  uniquely.  The  Laplace  transform  of  (1.2)  is  given  by 


Y(s,k)  - 


«Asz  +  (fAk2  +  tBk,  )s  +  «ckik2 


s3  +  (k1+k2)s  +  k1k2s 


(2.11) 


and  with  known  fA  -  tA  and  <c  -  tc  the  independent  equations  of  the  form 


(2.8)  are 


f  A  k2  +  <8k1  *  «Ak2  +  <Sk1 


k1  +  k2  -  k^  +  k2 


(2.12) 


k,k2  -  k,k2 


which  clearly  admit  the  second  solution  (1.3).  The  exceptional  subsets  are 


again  ^-0  and  k2-0,  where  the  model  is  unidentifiable.  Notice  that  eB  given 


•  v ' 
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by  (1.3)  may  be  negative  and  since  this  is  clearly  unphysical  the  ambiguity 
is  resolved  in  certain  cases,  depending  on  the  value  of  the  parameters  k. 

In  Example  2.1  with  eB  unknown  we  have  more  parameters  than  equations  and 
hence  unidentif iability  follows  immediately.  Though  (2.8)  generally  contains 
at  least  as  many  equations  as  parameters,  the  analysis  of  structural 
identif iability  is  very  simple. As  shown  by  Vajda  using  the  implicit  function 
theorem,13  the  model  is  structurally  identifiable  if  and  only  if  rank  J(k)=p 
for  some  kffi,  where  J-3<£/3k  denotes  the  Jacobian  matrix  of  <f> ,  and  p  is  the 
number  of  parameters.  The  condition  is  met  if  and  only  if  det  J(k)  (or  its 
principal  minors  in  case  of  a  nonsquare  matrix)  do  not  identically  vanish.  If 
rank  J(k)-q<p  for  all  kefi,  then  one  can  select  p-q  parameters  such  that  by 
fixing  their  values  the  model  becomes  structurally  identifiable  with  respect 
to  the  remaining  free  parameters.  Therefore,  the  integer  q-rank  J (k)  is 
called  the  number  of  determinable  parameters  and  will  play  an  important  role 
in  further  sections.  As  shown  in  Example  2.1,  the  number  of  determinable 
parameters  is  3,  since  the  model  is  identifiable  with  eB  fixed. 

Remark  2.1.  Since  the  elements  of  J(k)  are  analytic  functions  of  the 
parameters,  rank  J(k)  achieves  its  maximum  value  at  almost  every  keO. 
Throughout  the  paper  rank  J(k)  denotes  this  maximum  or  "generic”  rank  of 
J(k). 

Remark  2.2.  Though  identif iability  properties  have  been  defined  for  a  single 
experiment  specified  by  the  initial  concentrations  x0^^  the  analysis  can 
easily  be  extended  to  a  set  of  experiments  with  initial  conditions  x^^IcR". 
Indeed,  the  elements  of  J  are  also  analytic  functions  of  the  components  in 
xo^1)  and  similarly  to  Remark  2.1,  structural  identif iability  at  a  single 
Xq  c  I  implies  identif  iability  at  almost  every  Xp^^el.  There  can  exist, 
however,  exceptional  points  where  identif iability  is  lost,  e.g.,  selecting  a 
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stationary  state  as  initial  concentrations  in  the  experiment. 

Remark  2,3.  The  number  of  components  in  the  "moment"  invariant  vector  4>  is 
at  most  2mn,  which  is  an  upper  bound  on  the  number  of  determinable  parameters. 
Following  the  reaction  by  the  use  of  a  single  quantity  and  considering  only 
rate  coefficients  as  unknown  parameters  the  upper  bound  is  2n-l. 

Though  generating  the  Laplace  transform  (2.5)  of  the  response  function 
is  usually  not  very  tedious,  it  can  considerably  be  simplified  by  taking 
advantage  of  the  specific  method  proposed  by  Bossi  et  al.'4  As  discussed, 
for  testing  structural  identif iability  we  also  need  the  Jacobian  matrix  84>/d k 
and  its  determinant  (or  principal  minors),  which  can  easily  be  evaluated. 

The  analysis  of  unique  structural  identif iability  requires,  however,  the 
solution  of  the  polynomial  equations  (2.8).  It  should  be  emphasized  that 
this  step  may  be  considerably  more  difficult  than  in  Example  2.2,  where 
nonuniqueness  follows  from  interchangeability  of  the  rate  coefficients.  As 
shown  by  Norton1 5  in  his  exhaustive  analysis  of  first-order  reaction  schemes 
(linear  compartmental  models)  with  3  species,  sources  of  nonuniqueness  are 
generally  more  subtle  and  the  functions  relating  the  different  solutions  more 
complicated.  While  symbolic  languages  such  as  REDUCE  and  algebraic 
manipulation  subroutines  are  valuable  tools  in  solving  the  polynomial 
equations,16  with  some  persistency  the  solution  can  also  usually  be  obtained 
by  hand. 

2.  Similarity  transformation  approach 

This  method  is  based  on  introducing  the  new  variables  x  defined  by  x-Tx 
into  (2.1)  and  (2.2),  where  T  is  an  n  x  n  nonsingular  matrix.  The  transformed 
system  is  then  described  by 

x(t,k)  -  T_1 A(k)Tx( t , k) ,  x0(k)  -  T-1x0(k) 


y(t.k)  -  C(k)Tx(t ,k) . 


(2.13) 
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By  the  algebraic  equivalence  theorem  of  linear  system  theory1 7  a 
similarity  transformation  does  not  change  the  response  function,  thus 
y(t,k)-y(t,k) .  Let  f  denote  the  vector  formed  by  the  entries  in  T  and 
introduce  the  notation  T-T(f).  Since  T  is  arbitrary,  the  elements  of  f  are  a 
priori  free  with  the  only  constraint  det  Tff)^.  While  the  response  is 
invariant,  the  system  matrices  and  initial  conditions  are  changed  according  to 

A(k,f)  -  T'1 (f)A(k)T(f) ,  (2.14a) 

C(k,f)  -  C(k)T(f) ,  (2.14b) 

and 

*0(k,f)  -  T*1 (f)x(k) ,  (2.14c) 

where  A,  C,  and  x0  depend  on  f  in  addition  to  the  original  parameters  k.  Now 
we  check  how  the  knowledge  of  the  system  structure  restricts  the  possible 
values  of  f.  For  simplicity  assume  that  C  and  x0  are  completely  known  (i.e., 
do  not  depend  on  unknown  parameters).  Then  C-C  and  x0  -x0 ,  thus  (2.14b)  and 
(2.14c)  imply  the  constraints 

C-CT(f),  x0-T(f)x0.  (2.15) 

Further  constraints  follow  from  the  structure  of  the  matrix  A.  If  ajj(k)-0, 
then  we  also  require  ajj(k,f)-0,  where  a,- j  and  a(  j  denote  the  corresponding 
entries  in  A  and  A,  respectively.  All  constraints  form  a  set  of  equations 
for  the  parameters  f.  This  set  always  admits  a  nominal  solution  f°  such  that 
T(f°)-I  and  the  transformations  (2.14)  yield  the  original  system  matrices. 

The  existence  of  a  second  solution  f*f°  means,  however,  that  the  knowledge 
of  the  response  function  y(t,k)  and  all  the  available  structural  constraints 
does  not  specify  the  transformation  matrix  T(f)  and  hence  A,  C,  and  x0 
uniquely;  thus  the  model  is  not  uniquely  structurally  identifiable. 


« 


o 


* 
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Similarly,  an  infinite  number  of  solutions  for  f  shows  unidentif lability  of 
the  model. 

If  C  and/or  also  depend  on  unknown  parameters,  constraints  of  the 
form  (2.15)  do  not  apply  but  we  still  have  some  constraints  on  f  from  the 
partial  knowledge  of  C  and  .  A  formal  description  of  the  method  is  rather 
lenghty18  but  it  can  easily  be  understood  with  the  aid  of  the  following 
example . 

Example  2.3.  We  use  the  similarity  transformation  approach  to  solve  the 
simple  identifiability  problem  studied  in  Example  2.2.  The  reaction  scheme 
in  (1.1)  is  described  by 


A(k)  - 

-kt  o  o 
k,  -k2  o 

,  x0(t)  - 

’  *0  ,1 
o 

o  k2  o 

o 

C  “  I  eA  lB  (cl 

where  x^j  «[A]0.  The  transformation  matrix  is  a  priori  arbitrary,  thus 


T(f )  - 


f,  f2  f3 

U  f5  f6 

.  f7  f8  f9  . 


(2.17) 


with  the  only  constraint  det  T(f)*<0.  Since  x0 (k)-Xg  is  completely  known,  the 
constraint  in  (2.15)  applies  and  yields  fj-1,  f4-0,  and  f7-0.  According  to 
(2.14b)  the  transformed  observation  matrix  C(k,f)-(tA  <B  tc)  is  given  by  the 
elements 

”  *A  •  *B  “  ^2 f A  +  ^5*8  +  ^8*0*  eC  ”  ^3fA  +  ^6eB  +  ^9  4  C  (2.18) 

Since  7C  -  cc  is  known,  (2.18)  gives  f3-0,  f6-0,  and  f9-l ,  thus  using  the 
knowledge  of  x0  and  partial  knowledge  of  C  we  end  up  with  the  transformat  ion. 


matrix 
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1  f2  o 

T(f)  -  o  f5  o 

.  O  f8  1  _ 

(2.19) 

Apply  now  (2.14a)  to  form  the  transformed  matrix  A(k,f). 

Using  the  well 

known  formula1 9  T^-Cadj  T)T/(det  T)  we  obtain 

r  i  -f2/f5  o ' 

T"  ( f )  -  o  1/f5  o 

.  o  -h/h  1  . 

(2.20) 

and  hence 

-kj(14f2/f5)  -k1  f2  (1 +f2/f5  )  +  k2  f2 

o 

A(k, f )  - 

ki/f5  ^fj/fj-kj 

o 

(2.21) 

.  -k1  fa/f5  “  f  8  ^2  /^5  +  k2  fg  +  k2fj 

0 

Since  for  the  scheme  (1.1)  this  matrix  should  be  of  the  form  (2.16),  the 

constraints  imposed  on  (2.21)  are  as  follows: 

fl3l(k,f)  “  * k^  fg  / f 5  ”  0 

(2.22a) 

a12(k,f)  “  *^2^1*^2  +  ^2  /  ^5  ^ 

-  0 

(2.22b) 

a2  ,  (k,  f )  -  -an  (k,  f)  -  k, 

(2.22c) 

a32 (k, f)  -  -a22 (k, f)  -  k2 

(2.22c) 

Eq.  (2.22a)  implies  fe-0,  whereas  (2.22b)  admits  two  solutions 

given  by 

f2  -  0 

(2.23a) 

and 

f2/f5  -  <k2  -k1  )/k1 

(2.23b) 

Substituting  (2.23a)  into  (2.22c)  and  (2.22d)  gives  f5-l. 

thus 

T-I  and  we 

find  the  nominal 

solution  f°  that  yields  the  original  system. 

(2.23b)  gives 

f5-k1/k2  and  using  (2.22c),  (2.22d),  and  (2.18)  one  obtains  the  second 

solution  (1.3)  for  the  parameters. 

In  this  example  the  similarity  transformation  approach  requires  more 
calculations  than  the  one  based  on  Laplace  transforms.  Notice,  however,  that 
in  Example  2 . 2  we  had  to  solve  a  quadratic  equation  to  obtain  the  two 
solutions,  whereas  a^  2  in  (2.22)  is  the  product  of  two  factors.  In  more 
complex  cases,  j  solutions  for  the  parameters  frequently  imply  that  (2.8)  is 
reduced  to  a  single  polynomial  equation  of  degree  j  and  numerical  methods  may 
be  required,  whereas  in  the  similarity  transformation  approach  we  may  have 
fewer  variables  (i.e.,  the  elements  of  f  remaining  free  after  requiring 
invariance  of  C  and  x0  )  and  equations  of  somewhat  simpler  structure. 
Furthermore,  only  the  latter  method  enables  one  to  generate  all  reaction 
schemes  indistinguishable  from  a  given  one.  In  particular,  we  will  use  the 
matrix  (2.21)  to  solve  this  problem  in  Section  IV. 

Remark  2.4.  Assume  that  taking  into  account  all  the  available  constraints 
there  remain  r  free  variables  f1(...,fr  in  A(k,f).  Then  the  model  is 
unidentifiable  and  r  further  constraints  (e.g.,  fixed  values  for  r 
parameters)  are  required  to  render  the  model  structurally  identifiable. 
Therefore,  p-r-q  is  the  number  of  determinable  parameters,  defined  previously 
by  q-rank  d<f>/ 3k.  This  simple  rule  will  be  frequently  used  in  the  following 
sections . 

Ill .  Distinguishability 

In  this  section  we  consider  two  different  reaction  schemes  denoted  by  S 
and  S,  respectively.  Both  are  described  by  models  of  the  form  (2.1)  - 
(2.2).  Let  y(t,k)  and  y(t,k)  represent  their  response  functions  with  the 
initial  conditions  x^1 ^-x^1 ^  and  parameters  kefi  and  kef),  respectively,  where 
0  and  n  are  the  parameter  sets.  Extending  the  concept  of  parameter 
indistinguishabil ity  introduced  in  Section  II  to  two  different  models,  values 
k«n  and  k«Q  are  said  to  be  indistinguishable  if 


e 
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y(t,k)  -  y(t,k)  (3.1) 

for  all  tcT.  Since  the  values  of  k  and  k  are  not  a  priori  known,  as  a 
further  generalization  of  the  concept  the  models  S  and  S  are  said  to  be 
indistinguishable  if  for  almost  every  parameter  value  ktO  of  the  model  S 
there  exists  an  indistinguishable  value  k«n  of  S  and  vice  versa,  thus  for 
almost  every  kefi  of  S  there  is  an  indistinguishable  parameter  kffl  of  S.  Ther. 
with  appropriate  selection  of  parameters  both  models  generate  the  same  family 
of  response  functions  which  corresponds  to  the  usual  meaning  of  indistinguish- 
ability.  Since  the  property  is  an  equivalence  relation  between  the  parameter 
sets  0  and  0,  indistinguishable  models  have  also  been  called  structurally 
equivalent.20 

Let  Y(s,k)  and  Y(s,k)  denote  the  Laplace  transforms  of  y(t,k)  and 
y(t,k),  respectively.  Eq.  (3.1)  is  satisfied  if  and  only  if 

Y(s,k)  -  Y(s,k)  (3.2) 

for  all  seC.  The  components  of  Y(s,k)  and  Y(s,k)  are  rational  functions  of 
the  form  (2.7)  and  for  each  nonzero  coefficient  (k)  of  s^  in  the  numerator 
(denominator)  polynomial  of  Y^(s,k)  there  should  be  a  corresponding  nonzero 
coefficient  ^J-(k)  of  s^  in  the  numerator  (denominator)  polynomial  of 
Y,(s,k).  In  this  case  Y(s,k)  and  Y(s,k)  are  said  to  be  of  the  same  symbolic 
form.21  The  same  symbolic  form  is  a  necessary,  but  not  sufficient  condition 
for  indistinguishability ,22  It  implies,  however,  that  listing  the 
corresponding  coefficients  in  ^  and  ?  in  the  same  order  we  can  proceed  to  the 
analysis  of  the  polynomial  equations 


*(k)  -  *(k)  (3.3) 

£  where  4>  and  %  denote  the  vectors  of  "moment"  invariants  for  S  and  S, 


respectively.  To  establish  indistinguishability  one  has  to  solve  (3.3)  both 
for  k  in  terms  of  k  and  for  k  in  terms  of  k.  The  models  are 


e 
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indistinguishable  if  and  only  if  both  solutions  exist  at  almost  every  keO  and 
k«fl,  respectively.  Frequently  these  solutions  exist  only  over  some  open  sets 
fycO  and  fycfi,  then  S  and  S  should  be  restricted  to  these  subsets  to  ensure 
their  indistinguishability . 

Example  3 . 1 . (a)  In  (1.4)  and  (1.5)  we  listed  reaction  schemes  claimed  to  be 
indistinguishable  from  (1.1).  With  initial  concentrations  [B]0— [C]0  — 0  the 
Laplace  transform  of  the  response  function  for  the  latter  model  is  given  by 
(2.11).  Evaluating  Y(s,k)  for  model  S3  in  (1.5)  we  can  immediately  conclude 
that  it  has  a  different  symbolic  form  and  hence  the  two  models  are 
distinguishable . 

(b)  Now  we  test  distinguishability  of  the  model  S2  in  (1.4)  from  (1.1), 
denoted  here  by  S.  For  S2  we  have  the  Laplace  transform 


Y(s,k) 


V' 


s 


+(sVsV<ck2)s+sVVV 

3  +  (k,  +k2+kj  ) s  ^  +kj  (k!  +k2  )S 


[A]0 


(3.4) 


which  has  the  same  symbolic  form  as  (2.11)  for  S.  Therefore,  we  consider 
equations  of  the  form  (3.3),  given  here  by 

<Ak2  +  7Bk,  -  tAk3  +  £Bk,  +  £Ck2  (3.5a) 


«ck1k2  -  eBk3(k1+k2) 

(3.5b) 

k^ +k2  —  k^ +k2 +k3 

(3.5c) 

klk2  -  *3  (ki  +k2  ) 

(3 . 5d) 

i  Substituting  (3.5d)  into  (3.5b)  reduces  the  latter  to  the  equation  tc-tB . 

Since  «B  is  a  free  parameter  of  the  model  S2 ,  whereas  «c  is  a  known  constant, 
^  (3.5)  can  be  solved  for  k— ( kn  , k2 , cB )T  in  terms  of  k-(ki ,k2 ,k3 , cB )T  only  at 

{  this  particular  value  of  fB ,  otherwise  the  equations  are  contradictory . 

5 

*  Therefore,  the  models  are  distinguishable.  In  practical  terms,  the  two 

t 

>  models  generate  the  same  response  function  if  and  only  if  cB-«c.  This 
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particular  choice  is,  however,  meaningless  since  species  B  and  C  are  then 
lumped  and  both  models  loose  identifiability .  For  example,  the  common  factor 
s+k2  appears  in  the  numerator  and  denominator  polynomials  of  (2.11).  Notice, 
however,  that  the  models  become  indistinguishable  if  ec  is  an  additional  free 
parameter  of  (1.1). 

Example  3.2.  Consider  the  schemes  (1.1)  and  S1  in  (1.4).  By  (2.9)  and 
(2.11)  the  Laplace  transforms  are  of  the  same  symbolic  form,  whereas 
equations  (3.3)  are  given  by 

«Ak2  +  £akl  “  £A(k_.,+k2)  +  £b kl  (3.6a) 


k,  +  k->  -  ki  +  k  .  +  k-> 


(3.6b) 


ki  k 


1  *2 


k1  k2  ■ 


(3.6c) 


We  have  a  special  situation  here,  since  S  is  obtained  by  setting  k^-O  in  S 

and  hence  S  is  called  a  submodel  of  S.  Thus  for  any  £-(1^  ,k2,eB)Tefi  the 
parameter  value  k-(k.|  ,  0 ,  k2  ,  f  B  )T  satisfies  (3.6)  and  we  have  to  solve  the 
equations  only  for  k  in  terms  of  k.  The  solution  exists  for  all  kt  R4  and 

the  models  are  indistinguishable. 

Requiring  solution  of  polynomial  equations  makes  the  analysis  rather 
tedious.  In  a  number  of  cases,  however,  we  can  take  advantage  of  simple 
conditions  and  avoid  calculations.  We  list  here  the  basic  results  with  the 
underlying  mathematical  ideas 

Proposition  1.  Let  q  and  q  denote  the  numbers  of  determinable  parameters  in 
S  and  S,  respectively.  If  q><q  then  S  and  S  are  distinguishable.23 
Proof:  Since  rank  3^/3k-q,  the  set  tf(fi)  is  a  q-dimensional  manifold  in  some 

Euclidean  space  of  dimension  r>q .  Indistinguishability  implies  4>  (O)-^(Q) 
and  hence  q-q . 

Proposition  2.  If  S  is  a  submodel  of  S  and  the  number  of  determinable 
parameters  is  q  in  both  models,  then  there  exist  open  sets  cO  and  CH  such 
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that  restricted  to  these  sets  S  and  S  are  indistinguishable.24  Proof:  For 

simplicity  let  k=(k, , . . . .k^ , k^+1 )T  and  k-(k, . kq ) ,  thus  S  is  obtained  by 

setting  kq+1-0  and  it  is  structurally  identifiable.  Since  S  is  a  submodel  of 
S,  we  solve  (3.3)  only  for  k  in  terms  of  k.  The  solution  is  kj -kj  , i-1 , . . . , q 
if  kq+i-0.  Since  rank  3<£/k-q ,  by  virtue  of  the  general  implicit  function 
theorem  this  solution  can  be  extended  onto  an  open  neighborhood  ft,  of  the 
point  (k, , . . . , kq , 0)  in  OcR^+1 . 

Proposition  3.  If  S  and  S  are  submodels  of  S  and  all  three  models  have  the 
same  number  of  determinable  parameters,  then  there  exist  open  sets  ft, eft 
and  ft,  eft  such  that  restricted  to  these  sets  S  and  S  are  indistinguishable.25 
Proof:  It  follows  from  Proposition  2  and  transitivity  of  equivalence. 

Proposition  4.  Consider  structurally  identifiable  models  S  and  S  with  the 
same  number  p  of  parameters.  If  Y(s,k)  and  Y(s,k)  are  of  the  same  symbolic 
form  and  d<j>/d k  is  a  square  matrix,  then  there  exist  open  sets  0,00  and  ft,  eft 
such  that  restricted  to  these  sets  S  and  S  are  indistinguishable.26  Proof: 
Since  84>/dk  and  34>/a k  are  pxp  matrices  of  full  rank,  ^(0)  and  4>(0)  are  open 
sets  in  R.p  Their  intersection  contains  some  open  neighborhood  of  OfRp ,  and 
hence  is  not  empty. 

Using  these  propositions  our  examples  can  be  solved  without 
calculations.  Since  the  number  of  determinable  parameters  is  3  for  (1.1),  4 
for  S2  in  (1.4),  and  2  for  S3  in  (1.5),  by  Proposition  1  all  these  models  are 
distinguishable  as  shown  in  Example  3.1.  Both  models  considered  in  Example 
3.2  have  3  determinable  parameters  and  hence  are  indistinguishable  by 
Proposition  2.  By  Proposition  4  considering  ec  as  an  additional  free 
parameter  implies  that  models  (1.1)  and  S2  become  indistinguishable  as 
noticed  in  Example  3.1,  part  (b) . 


IV.  Exhaustive  modeling 

Given  a  reaction  scheme  S  and  experimental  conditions  specified  by  the 
initial  condition  xp )  and  observation  matrix  C  we  use  the  similarity 
transformation  approach  to  generate  all  the  first-order  reaction  schemes 

that  are  indistinguishable  from  S.  As  shown  in  Section  II,  we 
first  construct  the  transformation  matrix  T(f)  that  preserves  all  known 
properties  of  x0  and  C.  With  this  T(f)  the  transformations  (2.14)  yield  the 
most  general  linear  system  that  generates  the  original  response  function  at 
any  keO  and  f.  In  identif iability  analysis  we  imposed  further  constraints  in 
order  to  preserve  the  structure  of  the  reaction  scheme  and  checked  uniqueness 
of  the  corresponding  transformation.  Looking  for  different  models  we  don't 
consider  structural  constraints  here,  but  find  the  first-order  reaction 
scheme  that  corresponds  to  the  transformed  system  matrix  A(k,f)  by 
introducing  the  parameters 

n 

kji  -  aji  ,  i  *  j  ;  ~  -  l  «ji  . 

j  -* 

where  kj i  is  the  rate  coefficient  of  a  first-order  reaction  consuming  species 
i  and  producing  species  j ,  with  index  j-0  denoting  a  species  not  taken  into 
account  among  the  n  species  of  the  original  model.  If  no  such  species  can 
exist,  then  the  mass  conservation  condition  is  assured  by  the  constraints 

n 

l  (k.f)  -  0  (4.2) 

j-’ 

thereby  further  reducing  the  free  components  of  f.  The  transformation  (2.14) 
and  the  reparametrization  (4.1)  then  give  rise  to  the  most  general  (in  terms 
of  the  number  of  nonzero  rate  coefficients)  first-order  reaction  scheme 
S  such  that  with  the  parameters 
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kji (k,f)  -  a j  j (k , f )  (4.3) 

we  have  y ( t , k)-y ( t , k) .  As  will  be  shown,  any  model  indistinqishable  from  the 
original  S  is  a  submodel  of  S  and  hence  this  latter  is  said  to  be  the  "frame" 
model.  In  particular,  at  the  nominal  value  f-f°  the  model  S  reduces  to  S, 
thus  S  itself  is  a  submodel  of  S.  Let  q  and  q  denote  the  numbers  of  deter¬ 
minable  parameters  in  S  and  S,  respectively .  If  q-q ,  then  S  and  S  are 
indistinguishable  by  Proposition  2.S  is  unidentifiable  and  by  Proposition  3 
its  submodels  with  q  determinable  parameters  form  the  set  of  reaction  schemes 
indistinguishable  from  S.  The  next  example  illustrates  this  case. 

Example  4,1.  As  shown  in  Example  2.3.,  the  transformed  matrix  for  the  scheme 
(1.1)  is  given  by  (2.21).  Apply  the  constraints  (4.2)  to  the  columns  of 
(2.21).  It  can  be  readily  verified  that  these  are  satisfied  by 

f6  -l-f2-f5 

thus  we  eliminated  f8  from  (2.21).  Introduce  the  parameters 


kl  -  ki/fs  (4 -5a) 

£-1  -  -k^d+fj/fj)  +  k2f2  (4.5b) 

k2  -  k!  f2  (f2  -l)/f5  +  f2(k,.k2)  +  k2  (4.5c) 

k3  -  k,  (l-f2-f5)/f5  (4.5d) 

then  the  "frame"  model  obtained  is  given  by 

k, 

A  ml  B  (4.6) 

.  v-/- 

1-  n  w  i. 


Since  we  have  5  parameters  (i.e.,  4  rate  coefficients  and  the  extinction 

coefficient  ) ,  whereas  f2  and  f5  are  free,  by  Remark  2.4  the  number  of 
determinable  parameters  q  in  (4.6)  is  3.  Therefore,  (1.1)  and  (4.6)  are 

indistinguishable  and  the  set  of  all  reaction  schemes  indistinguishable  from 


A  .N  A  N 
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(1.1)  consists  of  the  submodels 


of  (4.6)  with  3  determinable  parameters  and  (4.6)  itself.  This  can  also  be 
verified  by  solving  (4.5)  for  k  and  f  in  terms  of  k.  Notice  that  (4.6)  has 
no  identifiable  submodels  with  3  determinable  parameters,  thus  all  models  in 
(4.7),  though  indistinguishable  from  (1.1),  are  unidentifiable.  The  example 
gives  the  correct  solution  of  the  problem  stated  in  Section  I. 

Consider  now  the  case  q>q .  Then  the  original  S  and  the  "frame"  model  S 
are  distinguishable  by  Proposition  1.  As  di scussed ,  for  any  Ice (2  (4.3)  gives  a 
parameter  value  for  the  "frame"  model  S  such  that  y (t , k)-y ( t ,k) ,  but  when 
selecting  a  point  k«fi  generally  there  exist  no  k  and  f  that  satisfy  (4.3). 

In  other  words,  the  "frame"  model  is  too  large  in  the  sense  that  it  can 
generate  all  response  functions  of  the  original  model,  but  the  converse  is 
not  true.  Therefore,  the  parameter  set  0  should  be  restricted  by 
considering  the  submodels  of  S  with  q  <  q  determinable  parameters  and  trying 
to  solve  (4.3)  for  k  and  f.  Though  these  submodels  are  the  only  candidates 
for  being  indistinguishable  from  S,  actual  indistinguishability  should  be 
checked  by  direct  calculation  in  each  case  as  shown  in  the  next  example. 
Example  4,2.  We  generate  the  indistinguishable  schemes  for  the  mechanism 


where  [B]o-[C]o-0  and  the  observed  quantity  is  [BJ.  Since  the  rate 
expressions  do  not  depend  on  [C] ,  it  is  sufficient  to  consider  the  kinetic 
equations  for  x^fA]  and  x2  —  [ B ]  : 
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-  (k1+k2  )  0  1  [  x, 


(4.9) 


U] 


The  parameter  vector  is  k-(ki,k2)T,  and  the  model  is  structurally 
identifiable,  thus  q-2 .  The  most  general  transformation  matrix  satisfying 


the  constraints  of  the  form  (2.15)  is 


T(f)  - 


1  f  ' 
0  1 


(4.10) 


depending  on  a  single  parameter  f.  By  (2.14a) 


A(k,f)  - 


(k1+k2)  -  k,f 


-  (k,+k2)f  -  k,f2 


(4.11) 


Because  of  the  presence  of  species  C  in  the  system,  constraints  (4.2)  are  not 
imposed.  Introducing  the  new  parameters 

ki  -  «2i  -  k, 


k-i  -  a12  -  -fCkT+kj+k!  f) 


(4.12) 


k2  -  -a22 -a, 2  -  f  (k2+k, f ) 
k3  -  -a, ,  -a21  -  k2+k!  f  . 

we  obtain  the  "frame"  model  with  the  structure  (4.6),  but  only  the  4  rate 
coefficients  as  parameters.  Since  these  depend  on  a  single  f,  by  Remark  2.4 
the  number  of  determinable  parameters  is  q«3.  Therefore,  (4.8)  and  the 
"frame"  model  are  distinguishable.  This  can  easily  be  proved  also  by  trying 
to  solve  (4.12)  for  k  and  f  in  terms  of  k.  Since  k^O  and  (4.8)  has  2 
determinable  parameters,  the  candidate  models  for  being  indistingushable  from 
(4.8)  are  only  the  submodels 


e 


S,  :  A 


k,  k2 

S2  :  A  - *  B  - -  C 


(4.13) 


of  (4.6)  with  2  determinable  parameters.  To  obtain  we  assume  k2-kj-0, 
which  is  satisfied  if  f  --^/k^  and  hence  S1  and  (4.8)  are 
indistinguishable.  There  exists,  however,  no  f  value  that  satisfies  the 
equations  k . 1 -kj -0  at  all  k,  thus  S2  is  distingushable  from  the  other  models. 
Notice  that  §i  in  (4.13),  the  only  model  indistinguishable  from  (4.8),  is 
structurally  identifiable  in  contrast  to  the  models  found  in  Example  4.1. 

As  will  be  shown  in  our  last  example,  for  a  slightly  more  complex 
reaction  scheme  there  may  exist  several  identifiable  models  that  are 
indistinguishable  from  the  original  one. 

Example  4.3.  Interpretation  of  growth  and  decay  data  through  the  use  of  the 
reaction  scheme 


(4.14) 


has  been  discussed  by  Carrington.6  In  addition  to  a  statistical  analysis,  he 
showed  that  with  the  initial  conditions  [B]0— [C]0  —  [D]0  — 0  and  observing  only 
[B] ,  (4.14)  is  not  uniquely  identifiable  with  2  solutions  for  the 
parameters.  We  solve  here  the  distinguishability  problem  also  stated  by 
Carrington  in  the  introduction  of  his  paper.  As  in  Example  4.2,  it  is 
sufficient  to  consider  the  kinetic  equations  for  the  species  A  and  B  only. 

The  transformation  matrix  T(f)  is  (4.10)  and  we  obtain  the  "frame"  model 
(4.6)  as  in  the  previous  examples. Its  parameters  are,  however , defined  now  by 
the  relations 


-  - f (k1  f+k! +k3 -k2 ) 

-  k-j  f2  +  (k3  -k2  )f  +  k2 


(4.15) 


-  k,  f  +  k. 


Both  (4.14)  and  the  "frame"  model  have  3  determinable  parameters  and  hence 
are  indistinguishable.  The  further  indistingiushable  models  are  the  two 
submodels 


S,  :  C 


a  B 

k- 1 


S,  :  A 


C  (4.16) 


of  (4.6)  with  3  determinable  parameters . Notice  that  both  latter  models  are 
structurally  identifiable. 

Though  the  above  results  follow  immediately  from  the  propositions  in 
Section  III,  it  is  worthwhile  to  check  solvability  of  (4.15).  For  example, 
indistinguishability  of  S1  and  (4.14)  requires  k2-0.  The  solution  for  f  is 
real  if  and  only  if  the  parameters  of  (4.14)  satisfy  the  inequality 
constraint  D-(k3 -k2  )2 -4k.|  k2>0  .  Thus  the  domain  of  indistinguishability  is 
restricted  to  an  open  subset  of  the  original  parameter  space  fi-R3 .  The 
calculation  also  shows  that  St  is  not  uniquely  identifiable  with  two 
solutions.  Since  the  solution  of  k3-0  always  exists  and  is  unique,  in  the 
case  of  S2  indistinguishability  is  unconstrained  and  the  model  is  uniquely 
structurally  identifiable. 

V.  Conclusions 

Assuming  a  reaction  scheme  and  following  the  reaction  by  observing  the 
quantities  accessible  to  measurements,  the  experiment  does  not  necessarily 
provide  sufficient  information  to  derive  unique  values  for  the  rate 
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coefficients  or  other  unknown  parameters  included  in  the  kinetic  model. 
Similarly,  there  may  exist  further  reaction  schemes  that  are  able  to  generate 
the  same  values  for  the  observed  varibles. 

Both  uniqueness  problems  are  relatively  easy  to  solve  in  the  case  of 
first-order  reaction  systems  with  observed  quantities  depending  linearly  on 
the  concentrations.  The  problems  of  identif iability  (i.e.,  uniqueness  of  the 
parameters)  and  distinguishability  (i.e.,  uniqueness  of  the  reaction  scheme) 
are  so  closely  related  that  the  same  analytical  tools  can  be  used  to  solve 
them.  A  very  simple  method  is  based  on  the  application  of  a  Laplace 
transform  to  the  kinetic  equations  and  results  in  a  set  of  polynomial 
equations  for  the  parameters.  By  solving  these  equations  one  can  check 
identif iability  and  find  the  equivalent  solutions  for  the  parameters  if  the 
model  is  not  uniquely  identifiable.  The  approach  can  be  extended  to  test 
distinguishability  of  two  different  first-order  reaction  schemes.  The  second 
method  we  used  is  based  on  state-space  similarity  transformations.  It  may  be 
less  convenient  to  study  identif iability  than  with  the  Laplace  transformation 
approch,  but  it  can  be  used  to  solve  the  more  general  problem  of  exhaustive 
modeling,  thus  to  generate  all  the  first-order  reaction  schemes  that  are 
indistinguishable  from  a  given  one.  Calculations  can  be  considerably 
simplified  by  taking  advantage  of  general  conditions  for  indistinguish- 
ability,  also  formulated  in  the  paper. 
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Abstract 


A  general  analysis  of  exact  lumping  is  presented.  This  analysis  can  be  employed 
to  any  reaction  system  with  n  species  described  by  a  set  of  first  order  ordinary 
differential  equations  dy/dt  =  f(y),  where  y  is  an  n-dimensional  vector;  f(y)  is  an 
arbitrary  n-dimensional  function  vector.  Here  we  only  consider  lumping  by  means 
of  a  rectangular-  constant  matrix  ^/(i.e.,  y  =  My,  where  M  is  a  row-full  rank  matrix 
and  y  has  lower  dimension  than  y).  It  is  found  that  a  reaction  system  is  exactly 
lumpable  if  and  only  if  the  intersection  of  the  invariant  or  the  null  subspaces  of  the 
Jacobian  matrix  J( y)  of  f(y)  for  all  values  of  y  is  nonempty.  The  intersection  is 
the  null  space  of  the  lumping  matrix  M .  If  the  dimension  of  the  intersection  is  less 
than  n,  nontrivial  lumping  schemes  can  be  obtained.  It  is  proved  that  the  Jacobian 
matrix  can  be  represented  as  a  linear  combination  of  certain  constant  matrices  and 
the  intersection  of  the  invariant  or  the  null  subspaces  of  the  constant  matrices  is 
just  that  of  the  Jacobian  matrix.  After  the  determination  of  the  intersections,  all 
possible  lumping  matrices  can  be  obtained.  The  kinetic  equations  of  the  lumped 
system  can  be  described  as  dy/dt  —  Mf(My),  M  is  any  generalized  inverse  of 
M  satisfying  MM  =  /„.  Several  implications  of  these  lumpability  conditions  are 
investigated  as  well  as  illustrated  by  some  simple  examples. 


I.  INTRODUCTION 


C 


A  problem  which  frequently  arises  in  the  study  of  chemical  kinetics  is  the  high 
dimensionality  and  high  degree  of  coupling  of  the  reaction  system.  For  example,  in 
many  realistic  chemical  processes,  particularly  those  related  to  petrochemistry,  in¬ 
dustrial  processes,  combustion  phenomena  and  atmospheric  chemistry,  the  number 
of  reacting  species  can  often  exceed  102  -  103.  It  is  impractical  to  incorporate  the 
kinetic  equations  for  each  species.  Consequently,  lumping,  by  which  several  species 
are  treated  as  a  single  component,  is  a  necessity.  Thus  one  desires  to  reduce  the 
reaction  mixture  into  a  small  number  of  lumps  in  the  kinetic  study  for  practical 
purposes. 

For  different  reaction  systems  the  suitable  ways  of  lumping  will  likely  be  dif¬ 
ferent.  Even  for  a  given  system,  there  could  be  many  lumped  models,  depending 
on  the  objectives.  However,  one  is  not  able  to  lump  a  system  arbitrarily,  because 
it  is  not  always  possible  to  find  a  model  or  a  set  of  differential  equations  describing 
the  behavior  of  the  lumped  species.  For  lack  of  theoretical  guidance,  researchers 
have  often  spent  many  years  trying  to  find  adequate  lumping  schemes  by  trial  and 
error.  The  modelling  of  catalytic  cracking  for  petroleum  is  a  typical  example.1  Con¬ 
founding  this  approach  is  the  fact  that  the  true  lumped  “species"  may  actually  be 
a  combination  or  function  of  the  original  physical  species. 

Prior  research  clearly  suggests  the  need  for  a  rigorous  study  of  lumping  which 
can  give  useful  guidelines  for  choosing  lumps.  Wei  and  Kuo2  gave  a  lumping  anal¬ 
ysis  of  unimolecular  reaction  system  and  their  work  was  extended  by  Ozawa3  and 
Bailey4,5.  One  of  the  authers6  presented  a  lumping  analysis  for  uni-  and/or  bi- 
molecular  reaction  systems.  Such  research  has  been  largely  confined  to  the  uni- 
and 'or  bimolecular  reaction  systems  with  the  focus  on  establishing  the  necessary 
and  sufficient  conditions  for  “exact  lumping”.  These  analyses  have  shown  that  exact 


3 


lumping  by  a  network  of  uni-  and/or  bimolecular  reactions  is  feasible  only  under  a 
very  restrictive  set  of  conditions.  Studies  of  the  pitfalls  and  magnitude  of  errors  in 
the  use  of  empirical  rate  expressions  for  lumping  many  independent  single  or  con¬ 
secutive  reactions  were  presented  by  Luss  and  his  co-workers.7-10  Unfortunately 
until  now  lumping  theory  was  not  sufficiently  developed  to  give  useful  guidelines  as 
to  which  lumps  to  choose  for  many  problems.  There  are  still  at  least  two  important 
problems  within  exact  lumping,  which  have  not  been  solved  yet. 

1.  There  is  no  known  a  priori  way  to  determine  the  lumping  scheme. 

2.  The  kinetic  equations  can  have  higher  order  nonlinearities  than  quadratic. 
For  instance,  this  situation  can  arise  in  the  presence  of  termolecular  reactions  or 
when  one  uses  equilibrium  or  steady-state  assumptions  to  omit  the  intermediates 
in  reactions.  In  addition,  nonisothermal  processes  or  the  use  of  empirical  rate  laws 
can  lead  to  highly  nonlinear  kinetic  equations.  Therefore  a  general  lumping  analysis 
capable  of  treating  arbitrary  physical  non-linearities  is  necessary. 

Considering  this  situation,  a  general  analysis  of  exact  lumping  is  presented  in 
this  paper.  It  can  be  used  for  any  reaction  system  and  the  previously  studied  lump¬ 
ing  analyses  of  uni-  and/or  bimolecular  reaction  systems  are  special  cases  of  this 
analysis.  In  addition,  this  analysis  can  also  be  applied  in  other  problems  described 
by  a  set  of  first  order  ordinary  differential  equations,  such  as  problems  arising  in 
classical  molecular  mechanics,  chemical  engineering  and  control  theory. 

Section  II  of  this  paper  presents  the  conditions  under  which  a  reaction  system 
is  exactly  lumpable  and  the  corresponding  kinetic  equations  of  the  lumped  system. 
In  section  III,  the  relationship  between  the  Jacobian  matrix  and  its  intersection  of 
the  invariant  or  the  null  subspaces  is  discussed  and  the  methods  to  determine  the 
intersections  are  derived.  Section  IV  provides  some  simple  examples  to  which  the 
general  lumping  method  is  applied.  Section  Y  presents  a  discussion  of  the  results. 
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II.  THE  THEORY  OF  EXACT  LUMPING 

A  .  CONDITIONS  UNDER  WHICH  A  REACTION  SYSTEM  IS  EXACTLY 
LUMPABLE 

Suppose  the  kinetics  of  an  n-component  reaction  system  can  be  described  by 

dy/dt  =  f(y),  (1) 

where  y  is  an  n-composition  vector;  f(y)  is  an  arbitrary  n-function  vector,  which 
does  not  contain  i  explicitly. 

Here  we  only  consider  a  special  class  of  lumping  by  means  of  an  n  x  n  constant 
matrix  M  with  rank  n(n  <  n).  If  a  system  can  be  exactly  lumped  by  the  matrix 
J\/,  it  means  that  for 

y  =  My  (2) 

we  can  find  an  n-function  vector  f(y)  such  that 

dy  /di  =  f(y ).  (3) 

If  y,  is  not  lumped,  row  i  of  M  is  a  unit  vector  e,T  =  (00...01 0...0),  and  y,  =  y, .  In 
this  case,  since  the  lumping  is  exact,  the  solutions  for  y,  and  y,  by  Equations  1  and 
3  are  the  same.  However,  Equation  3  is  simpler. 

From  Equations  1  and  2  we  have 

dy  jdt  -  Mdy  fdt  =  fl/f(y),  (4) 

and  upon  comparing  Equations  3  and  4  we  have 

M  f(y)  =  f(v).  (5) 
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Differentiating  both  sides  of  Equation  5  with  respect  to  y,  we  obtain 


MJ(y)  =  J(y)M, 


where  J{ y)  and  J(y)  are  the  Jacobian  matrices  of  f(y)  and  f(y),  respectively.  As 
the  rank  of  M  is  n,  there  must  exist  generalized  inverses  M  of  matrix  M  satisfying 


MM  =  In  , 


where  is  ri-identity  matrix.  Multiplying  both  sides  of  Equation  6  from  the  right 


bv  M ,  we  have 


MJ(y)M  =  J(y)MM  =  J(  y). 


Substituting  Equation  8  into  Equation  6,  we  obtain 


MJ(y)  =  MJ(y)MM< 


MJ(y)(In  -  MM)  =  0,  (9) 

where  I„  is  n-identity  matrix. 

Equation  9  is  the  fundamental  equation  in  exact  lumping.  Although  it  was 
derived  by  differentiation  of  Equation  5,  it  is  not  a  local  perturbation  theory  result. 
This  comment  follows  from  the  fact  that  we  demand  that  Equation  6  and  subsequent 
ones  be  valid  for  all  physical  values  of  y.  It  is  easy  to  prove  that  the  image  X  of  any 
vector  x  upon  mapping  by  (In  -  MM)  is  in  the  invariant  subspace  of  (7n  -  MM). 


Since 


x  =  (Jn  -  MM)x. 


(In  -  MM)2  =  (In  -  MM 


(In  -  M  M  )X  =  X. 
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However,  the  invariant  subspace  of  ( In  —  MM)  is  the  null  space  of  M,  since 

Mx  =  M(In  -  MM)x  =  (M  -  MMM)x 

=  (M  -  M)x  =  0.  (10) 

For  any  vector  x  in  n-dimensional  space,  Equation  9  shows  that 

MJ(y)(In  ~  MM)x  =  0, 

M J (y)x  =  0. 

Let 

x  =  J(y)x. 

Then 

Mx  =  0.  (11) 

Comparing  Equations  10  and  11,  we  find  that  if  x  is  in  the  null  space  of  M ,  then  so 
is  x.  This  is  valid  only  if  one  of  the  following  two  conditions  is  satisfied:  1)  The  null 
space  of  M  is  the  invariant  subspace  of  J( y).  Therefore,  after  mapping  by  J( y), 
the  image  of  vector  x  in  the  null  space  of  M  is  still  in  the  same  space;  2)  The  null 
space  of  M  is  also  the  null  subspace  of  J( y).  In  this  case,  x  is  a  null  vector  and 
Equations  10  and  11  trivially  hold.  Notice  that  these  arguments  are  valid  for  any 
value  of  y.  Thus  the  conclusion  is  that  there  exists  a  nontrivial  matrix  M  only  if 
the  intersection  of  the  invariant  or  the  null  subspaces  of  the  Jacobian  matrix  J( y) 
for  all  values  of  y  is  nonempty.  This  condition  is  also  sufficient.  Let  the  intersection 
be  spanned  by  the  column  vectors  of  matrix  X.  Then  we  can  use  X  to  represent 
the  intersection.  Since  the  intersection  is  a  subspace  of  the  n-dimensional  space, 
the  column  number  of  X  is  less  than  its  row  number.  If  the  intersection  exists,  one 


can  choose  this  intersection  as  the  null  space  of  the  lumping  matrix  M.  Then  we 
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MX  =  0. 


Transposing  Equation  12,  we  obtain 


XT MT  =  0. 
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There  are  an  infinite  number  of  solutions  of  M  for  a  given  A".  This  equation  can 
be  considered  as  a  set  of  linear  algebric  equations 


A'Tm  =  0. 


All  the  linearly  independent  solutions  of  m  compose  the  matrix  M .  In  some  situa¬ 
tions  we  may  desire  to  keep  a  number  of  species,  say  p,  unlumped.  Without  loss  of 
generality  we  can  consider  the  first  p  species  unlumped  and  all  lumped  species  are 
composed  of  others.  In  this  case  the  lumping  matrix  can  be  expressed  as 


=  (Ip  M 

\0  M  >)' 


where  Ip  is  the  p-identity  matrix;  Mi  is  an  (n  — p)  x  (n  —  p)  matrix.  There  is  an  extra 
restriction  on  determination  of  il/j  described  below.  Let  the  Jacobian  matrices  J(y) 


and  J( y)  be  blocked  as  follows: 


™-(£:  &)• 

**>=(£  £)• 


where  Ji\,Jn,Ji\  and  J22  are  p  x  p,p  x  (n  -  p),( n  -  p)  »  p  and  (n  -  p)  »  (n  -  p) 
matrices;  Jji  ,Ji  21^21  and  J27  are  pxp,px(n -p).(n-p)  >  p  and  (n  -  p)  »  (n  -  p) 
matrices,  respectively.  Using  Equation  6,  we  have 


•^11  —  Ju  < 


I 


M\  Jjj  —  J21 , 


Mi  J22  —  J22  M\ , 


J\2  ~  J12  M\  ■ 


Let  A'i  be  the  null  space  of  il/3 .  Multiplying  both  sides  of  Equation  19  from  right 
by  A’j ,  one  obtains  the  extra  restriction 


JjjA]  —  Jy2My  \  \  —  0. 


If  the  intersection  of  the  invariant  or  the  null  subspaces  of  J22{y)  exists  and  satisfies 
Equation  20,  then  My  can  be  determined. 

We  can  treat  the  general  lumping  problem  in  another  way  bv  considering  the 
corresponding  Green’s  functions  of  J(y)  and  J( y).  For  a  given  initial  value  of  y, 
J( y)  and  J( y)  can  be  represented  as  J(t)  and  J(<).  The  corresponding  Green’s 
functions  G^t)  and  G(t,r )  satisfy  the  following  relations: 


dG{i,T)idt  -  J(t)G(t,r )  =0.  t  >  r. 


G(t,t)  =  In. 


(21a) 


dG(t,r)/dt  -  J(t)G(t,r)  =  0,  t  >  r. 


(22a) 


G(t,t)  =  In 


(22  b) 


From  Equations  21  and  22  we  have 


d(MG(t,  t)  -  G(t,  r)M)/di  = 


=  MdG{t,r)!dt  -  (dG(t,r)/dt)M 
=  MJ{t)G{i.r)  -  j(t)G(t.T)M 
=  j{i)(AIG(t,  r)  -  G{t.r)M) 


K(t,r)  =  MG(t,  t)  -  G(t,r)M.  (23) 

Considering  Equations  21  and  22,  we  have 

dK(t,T)/di  =  J(t)K(l,T),  t  >  r.  (24a) 

A'(r,  r)  =  0.  (246) 

{dK(t,T)/dt)l=T  =  =  0. 

Since  A'(t,t)  and  (dK(t,r)/ dt)i=T  are  all  equal  to  zero,  for  t  >  r  we  have 

A'(i,  r)  =  0,  t  >  t 


or, 

MG(t,r)  =  G(t,r)M,  t  >  t.  (25) 

Equation  25  shows  us  that  the  corresponding  Green’s  function  has  the  same 
property  as  the  Jacobian  matrix.  Therefore,  all  the  results  for  the  Jacobian  matrix 
also  hold  for  the  corresponding  Green’s  function.  Since  the  treatment  is  the  same, 
we  will  only  consider  the  Jacobian  matrix  in  the  following  sections.  The  Green’s 
function  has  some  advantage  in  numerical  calculations,  since  we  can  use  it  to  find 
the  lumping  scheme  along  a  reaction  path  or  a  given  region  of  initial  conditions. 
This  prospect  also  opens  up  the  possibility  of  finding  approximate  lumping  schemes 
valid  only  in  a  desired  region  of  the  composition  space. 

B.  DETERMINATION  OF  THE  KINETIC  EQUATIONS  OF  THE  LUMPED 
SPECIES 

For  the  exactly  lumped  reaction  system,  after  deterining  M  we  have 


f(y)  =  A/f(y), 


(6) 


f(My)  =  M  f(y), 


(26) 


and  this  is  an  identity  for  any  y.  Therefore  let 

y  =  My, 


and  substitute  it  into  equation  26, 

f(MMy)  =  Aff(My), 

f(y)  =  A/f(A/y).  (27) 

Equation  27  does  not  place  any  restriction  on  M  except  that  MM  =  lh.  This 
latter  point  is  important  in  that  the  non-unique  nature  of  M  does  not  effect  the 
form  of  the  lumped  equations  (physical  model)  in  the  exact  case.  Thus  the  behavior 
of  the  lumped  species  can  be  described  bv 

dy/dt  =  Mf(A/y),  (28) 

where  M  is  anyone  of  the  generalized  inverses  satisfying  MM  —  /* .  The  kinetic 
equations  of  the  lumped  species  for  a  given  M  are  unique. 

III.  THE  PROPERTIES  OF  THE  JACOBIAN  MATRIX  AND  ITS  INTERSEC¬ 
TION  OF  THE  INVARIANT  OR  THE  NULL  SUBSPACES 

A.  THE  RELATIONSHIP  BETWEEN  THE  JACOBIAN  MATRIX  AND  ITS  IN¬ 
TERSECTION  OF  THE  INVARIANT  OR  THE  NULL  SUBSPACES 

In  above  section  it  was  shown  that  a  system  is  exactly  lumpable  if  and  only  if 
the  intersection  of  the  invariant  or  the  null  subspaces  of  J( y)  for  all  values  of  y  is 
nonempty.  The  problem  is  how  to  determine  the  intersection.  This  task  appears 
difficult,  because  y  can  take  infinitely  many  values.  Before  presenting  the  method 


to  determine  the  intersections,  we  will  first  discuss  the  relationship  between  the 
Jacobian  matrix  of  the  kinetic  equations  and  its  intersection  of  the  invariant  or  the 
null  subspaces  for  all  values  of  y. 


This  intersection  is  first  determined  by  the  singular  property  of  the  Jacobian 
matrix,  due  to  the  conservation  of  the  total  mass.  Let  m, ,  y,  be  the  mass  and  con¬ 
centration  of  species  t,  and  let  m  be  the  vector  of  m, .  According  to  the  conservation 
of  the  total  mass,  we  have 

mT  y  —  constant.  (29) 

d(mTy);dt  —  mT dy  dt 

=  mTf(y)  =  0.  (30) 

d(mT  f(y)),'dy  =  mTd{{y)/dy 

=  mT  J(y)  =  0.  (31) 

This  shows  that  at  least  one  row  of  the  Jacobian  matrix  is  a  certain  linear  com¬ 
bination  of  the  others.  Therefore  the  Jacobian  matrix  is  singular  for  all  values  of 

y- 

Since  J( y)  is  singular,  the  image  space  A,  of  the  n-dimensional  space  upon 
mapping  by  J( y,)  is  a  subspace  of  it(y,  is  any  given  value  of  y  and  i  can  take 
infinitely  many  values). 

J(yt)/n  =  A',,  (32) 

where  A",  is  composed  of  the  linearly  independent  columns  of  J( y, )  and  has  di¬ 
mension  less  than  n.  It  is  easy  to  demonstrate  that  the  union  of  all  A,  also  has 
dimension  less  than  n. 

Without  loss  of  generality,  let  row  n  of  J(y)  be  a  linear  combination  of  the 
other  rows.  The  bnear  combination  is  the  same  for  any  value  of  y.  Therefore,  any 
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column  vector  of  J( y)  for  tiny  value  of  y  is  located  in  the  same  (n  -  1  (-dimensional 
subspace.  Since  A,  is  composed  of  the  linearly  independent  columns  of  ./( y t ) ,  A\  is 
in  the  ( n  -  1  (-dimensional  subspace.  Since  all  A*,  are  located  in  the  same  subspace, 
then  the  union  of  all  A, 

A'  =  A,(Ja2|J  UA*  (33) 

is  also  in  the  same  (n  —  1  (-dimensional  subspace. 

Similarly,  if  k  rows  of  J( y)  axe  linear  combinations  of  the  others  for  any  value 
of  y,  the  union  A’  would  be  located  in  an  (n  -  ^(-dimensional  subspace.  This  is 
true  for  any  reaction  system  if  we  consider  the  mass  conservation  for  each  atom. 

Now  we  can  prove  that  the  union  X  is  an  intersection  of  the  invariant  subspace 
of  J( y).  We  have 

J( y,)-Y  €  J(yt)In  €  A,  e  A.  (34) 

which  is  valid  for  any  given  y,  and  A  has  dimension  less  than  n.  Therefore  A  is 
the  intersection  of  the  invariant  subspaces  of  J( y). 

Although  mass  conservation  is  a  “trivial"  conservation  property  leading  to 
lumping,  the  subspace  A’  forms  the  starting  point  to  determine  other  intersections 
of  the  invariant  subspaces  of  J( y).  First  we  can  demonstrate  that  any  subspace  in 
the  n-dimensional  space  containing  A’  is  an  intersection  of  the  invariant  subspaces 
of  J( y).  Then  we  can  prove  that  if  any  other  intersection  of  the  invariant  subspaces 
of  d(y),  say  Z ,  with  equal  or  lower  dimension  than  A  exists,  the  intersection  of  Z 
and  X  is  nonempty  and  is  a  new  intersection  of  the  invariant  subspaces  of  J{ y). 
These  statements  are  proved  below. 

Let  V  be  a  subspace  containing  A.  For  any  y,,  we  have 

d(y,)r  €  d(y,)/n  t  A,  €  A  £  V.  (35) 

Then  V  is  an  intersection  of  the  invariant  subspaces  of  J( y). 


Let  Z  be  a  subspace  with  dimension  equal  to  or  less  than  that  of  A  and  the 
intersection  between  Z  and  X  is  empty.  It  is  easy  to  prove  that  Z  can  not  be  an 
intersection  of  the  invariant  subspaces  of  J( y).  Since 

J{y,)Z  €  J(y,)In  £  A;  €  A,  (36) 

and  the  intersection  of  Z  and  A’  is  empty,  the  image  of  Z  upon  mapping  by  J(y, )  is 
out  of  Z.  Therefore,  Z  is  not  an  invariant  subspace  of  J( y,).  Consequently  it  is  not 
the  intersection  of  the  invariant  subspaces  of  J( y).  If  the  intersection  of  Z  and  A\ 
say  II  ,  exists  and  Z  is  an  intersection  of  the  invariant  subspaces  of  J( y),  according 
to  Equation  36  the  image  of  Z  must  be  in  IV.  After  mapping  by  J( y, ),  the  image 
of  any  vector  in  II  is  still  in  it.  Therefore  IV  is  an  intersection  of  the  invariant 
subspaces  of  J( y).  This  implies  that  A’  and  its  subspaces,  which  are  invariant  to 
J(y),  have  a  central  role  in  constructing  all  intersections  of  the  invariant  subspaces 
of  J( y). 

Suppose  Z  is  a  subspace  of  A,  then  Z  is  an  intersection  of  the  invariant  sub¬ 
spaces  of  J(y)  if  and  only  if  the  image  of  Z  upon  mapping  by  J(yt)  is  the  intersection 
of  Z  and  A’,.  To  prove  this  point  suppose  that  Z  is  an  intersection  of  the  invariant 
subspaces  of  J( y),  then 

J(y,)Z  e  z. 

However,  we  also  have 

J{yi)Z  €  A\. 

This  means  that  the  image  of  Z  upon  mapping  by  J( y,)  is  the  intersection  of  Z 
and  A*,.  This  condition  is  also  sufficient.  Suppose  the  image  of  Z  upon  mapping  by 
J(y,)  is  the  intersection  of  Z  and  A',.  Therefore,  the  image  of  Z  upon  mapping  by 
J( y)  for  any  value  of  y  is  in  Z.  Then  Z  is  the  intersection  of  the  invariant  subspaces 


Let  the  intersection  of  Z  and  A",  be  Z, .  Then  we  can  represent  Z  as  follows: 

Z  =  Z1\JZ2\J...\JZi.  (37) 

Equation  37  would  be  useful  for  determining  the  intersections  of  the  invariant  sub¬ 
spaces  of  J( y)  with  dimension  less  than  that  of  A'. 

We  can  consider  this  problem  in  other  way.  Suppose  the  intersection  X  has 
been  determined,  and  a  subspace  Z  of  it  is  invariant  to  7(y).  Z  can  be  described 


Z  =  A'S, 


where  5  is  an  (n  —  h)  x  m{m  <  (n  —  n))  constant  matrix.  Notice  that 


J(y)A  -  A'P(y), 


where  P( y)  is  an  (n  —  h)  x  (n  -  n)  matrix,  and 


J(y)Z  =  J(y)X  S  =  A'SQ(y), 


where  Q( y)  is  an  m  x  m  matrix.  Multiplying  both  sides  of  Equation  39  from  the 


right  by  5,  we  have 


J(y)XS  =  XP(y)S. 


Comparing  Equation  40  and  41,  we  obtain 


XSQ(y)  =  A'P(y)S. 


Considering  that  A’’  is  column-full  rank  matrix,  one  can  always  find  a  generalized 


inverse  A  satisfying 


XX  =  /„_*. 


Multiplying  both  sides  of  Equation  42  from  left  by  A”  gives 


XXSQ(y)  =  X  X  P{y)S, 


II, 

u 


SQ( y)  -  P(y)S.  (44) 

Similarly,  we  can  determine  a  generalized  inverse  5  satisfying 

55  =  /m.  (45) 

Multiplying  both  sides  of  Equation  44  from  left  by  5,  we  have 

55C?(y)  =  5P(y)5, 

Q(  y)  =  SP(y)S-  (46) 

Substituting  Q( y)  into  Equation  44,  one  can  obtain 

SSP(y)S  =  P(y  )5, 

(In—h  -  55)P(y)5  =  0. 

Transposing  this  equation  gives 

ST PT{y){In—h  -  ST ST)  =  0.  (47) 

Equation  47  is  exactly  the  same  as  Equation  9.  This  implies  that  A'  has 
subspaces  invariant  to  J( y)  if  and  only  if  the  intersection  of  the  invariant  or  the 
null  subspaces  of  PT( y)  is  nonempty.  Therefore  we  can  employ  the  same  method 
for  determining  A'  to  determine  Z.  In  this  way  we  can  find  out  all  intersections  of 
the  invariant  subspaces  of  J( y)  with  different  dimensions. 

The  intersection  of  the  null  subspaces  of  J( y)  is  the  solution  of  the  following 
linear  aigebric  equation 

J(y)x  =  0.  (48) 

Moreover,  the  solution  is  independent  of  the  value  of  y.  If  we  consider  y  symbol¬ 
ically,  this  show’s  that  there  is  a  nontrivial  solution  of  Equation  48  if  and  only  if 
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some  columns  of  J( y)  are  linear  combinations  of  the  other  columns.  All  the  linearly 
independent  solutions  of  Equation  48  compose  the  largest  intersection  of  the  null 
subspaces  of  J( y).  Any  subspace  of  this  intersection  is  also  an  intersection  of  the 
null  subspaces  of  J( y). 

B.  DETERMINATION  OF  THE  INTERSECTION  OF  THE  INVARIANT  OR 
THE  NULL  SUBSPACES  OF  J{ y)  FOR  ALL  VALUES  OF  y 

The  Jacobian  matrix  can  be  considered  as  an  v?  vector.  Therefore,  for  any 
value  of  y,  J( y)  can  be  represented  as  a  linear  combination  of  m(m  <  n2)  constant 
mat  rices: 

m 

J(y)  =  Vat(y).lt,  (49) 

fc=i 

where  a*( y)  are  parameters,  which  are  the  functions  of  y;  .4^  are  constant  matrices, 
which  are  considered  as  a  basis  of  J( y).  The  problem  is  how  to  determine  the  basis 
Ak .  There  are  several  ways  to  do  it,  and  one  is  as  follows.  The  Jacobian  matrix 
J( y)  can  be  represented  as 

m 

J(  y)=  Vh,(  y)^.;,  (5°) 

i,j  =  i 

where  jij(y)  is  the  (t,j)-entry  of  J(y );  Exj  is  the  elementary  matrix,  which  is  defined 
as  the  (n  x  n)-matrix  having  unity  in  the  (i , _7 )t h  position  and  all  other  elements 
are  zero. 

If  is  equal  to  cjij{ y),  where  c  is  a  constant,  we  can  combine  these  two  terms 
as 


a*(  y)  =  Jtj(y). 


t 


I 

I 


l 


In  this  way  one  can  combine  the  terms  in  Equation  50  as  much  as  possible  to  obtain 
the  following  simplified  formula 


J{y)  =  55  ^(y)Ak. 

k=  1 


(51) 


where  m  is  less  than  n2 . 

It  is  easy  to  demonstrate  that  the  intersection  of  the  invariant  or  the  null 
subspaces  of  all  constant  matrices  Ak  is  that  of  J( y).  Let  the  n  x  (n  —  n)  constant 
matrix  A’  represent  the  intersection  of  the  invariant  subspaces  for  all  Ak .  Multiply 
both  sides  of  Equation  51  from  the  right  by  A  to  obtain 

m 

J(y)X  =Vat(yMtI, 

k=  1 

m 

=  V  ak{y)XPk, 


k=i 


=  X  V  ak{y )Pk. 

k- 1 


(52) 


where  Pk  are  (n  —  ri)  >  (n  —  fi)  constant  matrices.  Equation  52  shows  that  A'  is  the 
intersection  of  the  invariant  subspaces  of  J{ y). 

Similarly,  we  can  prove  that  the  intersection  A  of  the  null  subspaces  of  all  Ak 
is  also  that  of  J(y). 


J(y)X  =  55  ak(y)*4fcA*, 

k=  1 
m 

=  55  o*(y)°  =  °- 


(53) 


k=l 


If  Equation  49  satisfies  the  restriction  that  in  each  case  we  can  choose  an 
appropriate  value  of  y  such  that  all  ak(yt )  vanish  except  a,(y,).  i.e.. 


«/(y« )  =  a,(y,).4, .  (i  —  1.2 . 


m 


(54) 
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Then  the  intersection  A  of  the  invariant  or  the  null  subspaces  of  J(y )  is  also  that 
of  all  Ait.  Multiplying  both  sides  of  Equation  54  by  A’  from  the  right,  when  A'  is 
the  intersection  of  the  invariant  subspaces  of  J( y),  we  obtain 


J(y,)A  =  at(y,)4,A', 

A'P(y,)  =  al(y,)AtA. 

Since  a,(yt)  is  not  equal  to  zero,  then 

AtX  =  A'P(y,  )  'at (y, ).  (55) 

If  A  is  the  intersection  of  the  null  subspaces  of  J( y),  similarly  we  have 

J(  y«)A‘  =  o,ty,)A,  A, 

0  =  a,(y,  ).4i  A’, 

0  =  .4,  A.  (56) 

Equations  55  and  56  show  that  A'  is  the  invariant  or  the  null  subspace  of  .4,.  Since 
this  is  valid  for  all  A*,  then  A’  is  the  intersection  of  the  invariant  or  the  null  subspaces 
for  all  Afc.  Thus  we  can  determine  the  intersections  of  J( y)  by  only  determining 
the  intersections  of  all  Ak- 

When  the  reaction  system  is  a  uni-  and/or  bimolecular  reaction  system  de¬ 
scribed  by  linear  or  quadratic  first  order  ordinary  differential  equations,  the  ele¬ 
ments  of  J( y)  are  only  linear  functions  of  y*s.  In  this  case.  Equation  51  will  have 
a  simple  form,  i.e.,  afc(y)  is  constant  or  i/*. 


where  m  is  equal  to  or  less  than  n,  and  ,4o  can  be  a  null  matrix.  As  Equation 
57  satisfies  the  above  restriction,  we  can  determine  ail  intersections  of  J{ yl  by 
determining  those  of  A0  and  ail  A^. 

Notice  that  the  invariant  subspace  of  a  constant  matrix  contains  at  least  one 
eigenvector  of  it.  This  property  presents  a  method  to  establish  the  intersection  of 
the  invariant  subspaces  of  A0  and  all  Ak ■  First,  the  eigenvectors  of  and  all 
A*  are  determined.  Then  consider  all  possible  combinations  of  these  eigenvectors. 
Each  combination  contains  at  least  one  eigenvector  of  every  constant  matrix.  The 
linearly  dependent  eigenvectors  are  cancelled  in  each  combination.  The  resultant 
combinations  are  examined  for  .40  and  .4*  whether  they  are  invariant  to  all  the 
matrices.  If  a  combination  is  invariant  to  ,40  and  all  A *..  it  is  an  intersection  of 
the  invariant  subspaces  of  these  matrices.  We  can  also  determine  all  intersections 
of  the  invariant  subspaces  of  J( y)  by  first  determining  A  and  then  its  subspaces  Z 
invariant  to  J{ y). 

The  determination  of  the  intersection  of  the  null  subspaces  is  easier.  We  need 
to  find  the  common  solutions  for  the  following  equations: 

.40x  =  0.  (58) 

.4fcX  =  0.  (A:  =  1,2 . m)  (59) 

In  order  to  obtain  the  common  solutions,  we  put  .40  and  all  together  and  omit 
the  linearly  dependent  rows.  Then  we  obtain  a  constant  matrix  .4  and  solve  the  set 
of  linear  algebric  equations 

Ax  =  0.  (60) 

All  the  linearly  independent  solutions  of  x  compose  the  largest  intersection  of  the 
null  subspaces  of  .4n  and  all  .4*.  Most  importantly  any  subspare  of  the  largest 
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intersection  is  still  an  intersection  of  the  null  subspaces  of  these  matrices.  The  pro¬ 
cedure  of  determining  the  intersections  will  be  illustrated  by  some  simple  examples 
below . 

IV.  APPLICATION  TO  UNI-  AND/OR  BIMOLECULAR  REACTION  SYSTEMS 

As  examples  of  the  application  of  this  analysis,  we  choose  uni-  and/or  bimolec- 
ular  reaction  systems.  As  pointed  out  above,  in  this  case  the  Jacobian  matrix  can 
be  described  as 

m 

J{ y )  =  Ap  +  Y  yk  Ak ■  (61) 

k=  1 

For  a  unimolecular  reaction  system,  the  kinetic  equations  are 

dy/dt  =  Ay,  (62) 

where  A  is  the  rate  constant  matrix.  The  Jacobian  matrix  for  the  unimolecular 
reaction  system  is  just  A\ 

J(  y)  =  A\  (63) 

In  this  case,  there  does  not  exist  the  problem  of  intersections,  because  J{ y)  is  a 
constant  matrix.  The  subspace  spanned  by  a  subset  of  eigenvectors  is  the  invariant 
subspace  of  a  constant  matrix.  If  the  matrix  has  full  eigenvectors,  the  subspace 
spanned  by  the  eigenvectors  with  zero  eigenvalue  is  the  null  space  of  the  matrix. 
Therefore  any  unimolecular  reaction  system  is  exactly  lumpable  and  the  lumping 
schemes  are  easy  to  obtain  after  determining  the  eigenvectors  of  the  rate  constant 
matrix.  This  behavior  can  be  shown  by  a  simple  example. 

Example  I 
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A  unimolecular  reaction  system  with  3  species  is  described  as  follows:2 

3 


-4] 


10 


10 


where  Aj,A2  and  A3  represent  the  three  species;  ail  numbers  are  unitless  rate 
constants.  Let  y,  represent  the  concentration  of  species  .4,.  Then  the  corresponding 
kinetic  equations  can  be  described  as 


dy  'dt  =  Ay, 

where  y  is  the  concentration  vector;  A  is  the  rate  constant  matrix. 


(64) 


K  = 


-13 

2 

4  \ 

3 

-12 

6 

(65) 

V  10 

10 

-10/ 

=  d(Ky)/dy 

=  K. 

(66) 

The  eigenvector  matrix  A*  and  the  eigenvalue  matrix  A  of  J( y)  are 


0.2  0.2  1 

A'  =  |  0.3  0.3  -1 

0.5  -0.5  0 


(67) 


0  0  0 

A  =  |  0  -20  0 

0  0  -15 


(68) 


Since  any  subspace  spanned  by  a  subset  of  eigenvectors  is  an  invariant  subspace 
of  «7(y),  we  choose  the  1-dimensional  subspace  spanned  by  the  last  eigenvector. 
Using  Equation  14  we  have 


(1  -1  0) 


-  0. 


(1  =  1.2) 


(69) 
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Solving  this  equation  we  obtain  the  solution  for  the  lumping  matrix  M: 


rri]  =  ( c 

c 

o)J 

3 

K> 

II 

o 

0 

d\ 

U  o 

II 

< 

c 

0\ 

0 

d 

(70) 

where  c,  d  are  arbitrary  constants.  We  want  M  to  have  a  full  row  rank,  thus  requiring 
c,  d  7^  0.  A  special  case  is  c  =  d  =  1,  i.e., 


M 


(l  1  °V 

Vo  0  \) 


(71) 


For  this  M  we  can  find  an  infinite  number  of  M  satisfying  MM  =  /2.  We  arbitrarily 
choose  two: 

/  0.5  0\  /  0.4  0 

Mi  =  0.5  0  ,  M2  =  0.6  0 

V  0  1/  V  o  i 

It  is  easy  to  show  that  the  kinetic  equations  for  the  lumped  system  are  the  same  in 
spite  of  using  different  M .  According  to  Equation  27 

/( y)  =  Mf(A/y), 


and  since 


then 


For  A/j  we  have 


f(y)  =  A'y> 


f(y)  =  MK  My. 


f(y)  = 


/i  i  o\ 

vo  o  i; 

_  (-10  10  V 

■  V  10  -10  ) 


-13 

3 

10 


2 

-12 

10 


4 

6 

-10 
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0.5  O’ 
0.5  0 

0  1 


(72) 


1 


** 


t , 


r  I 
4. 

V 


3 


K 

*2 

«*? 

ii 


VI 


Q 


Similarly  for  J\/2  we  have 


*>-(i  o -«  :)r  t 


-10  10 


10  -10 


The  reaction  scheme  of  the  lumped  system  can  be  described  as 


dy/dt  =  h'y, 


where  y  =  (yj  ,y2)r  and 


-10  10 


10  -10 


EXAMPLE  II 


A  uni-  and  bimolecular  reaction  system  with  8  species  is  illustrated  as  follows:13 


Aj  +  Aj 


A3  +  A4 


4  1  2 


l\  A5  —  Ar  /  2 


where  the  A^s  are  species;  the  numbers  are  unitless  rate  constants. 


I 


t 

\ 

i 

t 
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Leting  y,  represent  the  concentration  of  A,,  it  is  easy  to  write  out  the  kinetic 
equations  and  the  corresponding  Jacobian  matrix  J( y). 


dy i  /d<  =  -2yj  -  2y}y?  +  4 y3y4 
dy2/d<  =  -2y2  -  2yiy2  +  4y3y4 
dyi/dt  =  -2y3  -  4y3y4  +  2 y4y2 


dy4/dt  =  -2y4  -  4y3y4  4-  2 y:y2 
dy5/d<  =  -y5  +  y2  +  2y2  +  \/2y6 
dy6  (dt  =  -  v^ys  +  2y3  +  y5 
dy7/d<  =  -v^yT  +  yi  +  yg 


(74) 


*%)  = 


v 


dy 8 /df 

=  -y«  +  2  y4  +  \/2y7 

2(1  +  V2> 

-2yi 

4y4 

<y3 

—  2j/2 

-2(1  +  VI  ) 

<V4 

4V3 

2y2 

2yj 

-2(1  +  2y4  ) 

-4y3 

0 

2y2 

2si 

— 4y4 

-2(1  +  2y3  ) 

i 

2 

0 

0  -1 

0  0 

0 

0 

2 

0  1 

-V^ 

0  0 

1 

0 

0 

0  0 

0 

1 

0 

0 

0 

2  0 

0 

\/2 

This  matrix  can  be  represented  as 

4 

J(y)  =  A0  +  \^ykAk, 

k=\ 

where 

/  — 2  0  0  0 
0-200 
0  0-20 
.  _  0  0  0  -2 

A°  ~  1  2  0  0  -1 

0  0  2  0  1 

1  0  0  0  0 

V  0  0  0  2  0 


0 

\2_  0  0 
-  v  2  0_  0 

0  -\2  1 
0  V2  -1  / 
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Ax  = 


A3  = 


-2  0  0 


-2  0  0  0 


1  0 

-2 

0 

0 

-2 

0 

0 

0 

0 

2 

0 

0 

0 

2 

0 

0 

0 

0 

0 

2 

0 

0 

A2  — 

2 

0 

0 

0 

0 

0 

0 

0 

V 

) 

V 

/° 

0 

0 

4 

\ 

(° 

0 

4 

0 

0 

0 

0 

4 

0 

0 

4 

0 

0 

0 

0 

-4 

0 

0 

0 

-4 

0 

0 

0 

0 

0 

-4 

A  4  — 

0 

0 

-4 

0 

0 

0 

0 

0 

/ 


V 


\ 


) 


The  corresponding  eigenvector  matrices  Xa0  ^ Ak  "ith  their  eigenvalues 
are  as  follows: 


Xx  - 


—  2. 


(1  -  V2)/2 


-2, 

—  2, 

—2, 

2  - 

1 

1  -  \/2 

(y/i  -  3)/2 

—  1 

v/2  ~  1 

-1/2 

-1/2 

(1  -  s/2)/2 

*/y/i 

1/2 

(y/i  ~  l)/2 

-(1  +  v72). 


-O  +  \'2). 


0 


=  -2  0 


XA,  = 


XA,  = 


1  1 


1  0 


-1  -1 


-1  0 


0  0 


0  0 


0  0 


0  0 


-2  0 


1  0 


1  1 


-1  -1 


-1  0 


0  0 


0  0 


0  0 


0  0 


0  0  0  0  0 


0  0  0  1  0 


0  0  0  0  0 


0  0  0  0  0 


0  0  0  0  0 


1110  1 


-10  000 


0-1000 


0  0  -1  0  0, 


0  0  0  0  0 


0  0  0  0  0 


0  0  0  1  0 


0  0  0  0  0 


0  0  0  0 


1110  1 


-1  0  0  0  0 


0  -1  000 


0  0-100 


A,  = 

-4 

0 

0 

0 

0 

0 

0 

0 

* 

. 

1 

1 

0 

0 

0 

0 

0^ 

1 

-1 

0 

0 

0 

0 

0 

0 

-1 

0 

-1 

0 

0 

0 

1 

0 

• 

Xas  = 

-1 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

1 

1 

1 

0 

1 

0 

0 

0 

-1 

0 

0 

0 

0 

» 

0 

0 

0 

0 

-1 

0 

0 

0 

V  o 

0 

0 

0 

0 

-1 

0 

0/ 

* 

A,  = 

-4 

0 

0 

0 

0 

0 

0 

0 

, 

1 

] 

0 

0 

0 

0 

0  ^ 

1 

-1 

0 

0 

0 

0 

0 

0 

• 

-1 

0 

0 

0 

0 

0 

0 

0 

A'a4  = 

-1 

0 

-1 

0 

0 

0 

1 

0 

0 

0 

0 

1 

1 

1 

0 

1 

* 

0 

0 

0 

-1 

0 

0 

0 

0 

0 

0 

0 

0 

-1 

0 

0 

0 

l  0 

0 

0 

0 

0 

-1 

0 

0/ 

'  Using  the  methods  presented  in  section  III  B,  one  can  obtain  all  possible  inter¬ 

sections  with  different  dimensions  of  the  invariant  and  the  null  subspaces  of  J( y). 
First  we  use  the  singular  property  of  the  Jacobian  matrix  to  determine  the  inter- 
^  section  of  the  invariant  subspaces  of  J{y).  In  this  example,  we  have 

lTJ(y)  =  0T. 

e 


V 
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where  1T  =  (1  1  ...  1).  Therefore,  any  column  of  J{ y)  for  any  value  of  y  is  located 
in  the  same  (n  —  l)-dimensional  subspace.  This  subspace  can  be  constructed  by 
Equation  33. 

a-  =  a'0|Ja-1|J..-Ua'4' 

where  A'0  and  Xk  are  images  of  n-dimensional  space  upon  mapping  by  A0  and  Ak, 
which  are  composed  of  the  linearly  independent  columns  of  A$  and  Ak.  Then  we 
have 


/-2 

-2 

4 

4 

-2 

0 

0 

0 

0 

0  \ 

-2 

-2 

4 

4 

0 

-2 

0 

0 

0 

0 

2 

2 

-4 

-4 

0 

0 

-2 

0 

0 

0 

2 

2 

-4 

-4 

0 

0 

0 

-2 

0 

0 

0 

0 

0 

0 

1 

2 

0 

0 

-1 

0 

0 

0 

0 

0 

0 

0 

2 

0 
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0 

0 

0 

0 

0 

1 

0 

0 

0 

0 

1 

V  0 

0 

0 

0 

G 

0 

0 

2 

0 

-1/ 

After  omitting  the  linearly  dependent  columns  the  intersection  X  is  obtained. 


/— 2— 20  0  0  0  0  \ 
-2  0  -2  0  0  0  0 

2  0  0  -2  0  0  0 

2  0  0  0  -2  0  0 

0  1  2  0  0  -1  0 

0  0  0  2  0  1  0 

0  1  0  0  0  0  1 

\0  0  0  0  2  0  -1  / 


If  we  change  the  bases  of  this  subspace,  it  can  be  described  by  a  simpler  form: 
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3 


1  1 

-1  0 


1  1 
0  0 


1  1 
0  0 


A,  = 


0  0 
0  0 


0  0 

-1  0 


0  0 
0  0 


0  0 
0  0 


0  0 
0  0 


0  0 
0  0 


0  0  0  0 


-1  0 
0  -1 


It  is  easy  to  prove  that  any  column  of  X  is  a  linear  combination  of  the  columns  of 


A'j .  Then  the  two  matrices  are  equivalent  to  represent  the  subspace. 


Now  w'e  use  the  eigenvectors  of  Aq  and  .4*  to  construct  the  intersection  of  the 


invariant  subspaces  of  J( y).  After  examining  the  eigenvector  matrices  we  find  that 


the  first  6  columns  of  A',t0  and  Xa3  to  A’/i4  are  linear  combinations  of  the  same 
columns  of  A'^, .  Therefore  the  subspace  A’2  spanned  by  the  first  6  columns  of  A 


is  an  intersection  of  the  invariant  subspaces  of  J(y). 


1  1 
0  0 


-1  0 
0  -1 


1  1 
0  0 


-1  0 


After  examination  we  can  find  that  the  subspaces  A3  of  A~2 


0  0 
0  0 


A  3  = 


0  0 
0  0 


1  1 

-1  0 


0  -1 

0  0 


.VkVVttALWMJttUkUV-rf 
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is  invariant  to  .4o  and  .4*.  Therefore  it  is  also  the  intersection  of  the  invariant 
subspaces  of  J( y).  Similarly  the  two  subspaces  A4,  A 5  of  A  and  the  union  A6  of 
A 4  find  A5 


/ 0  V 

( 0  V 

(  0 

0  V 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

-1 

II 

10 

K 

0 

A~6  = 

-1 

0 

1 

0 

1 

0 

0 

1 

0 

1 

Vo  ) 

V-i  / 

V  0 

-1/ 

are  intersections  of  the  invariant  subspaces  of  J( y). 

The  intersection  of  the  null  subspaces  of  J( y)  can  be  determined  by  Equation 
48  and  the  solution  is  A't  ■ 


At 


/  0  ON 
0  0 
0  0 
0  0 
V2  0 
1  0 
0  1 
Vo  V2I 


For  these  intersections  the  corresponding  lumping  matrices  Mi 


are  determined 


by  Equation  13.  All  A/,  are  arranged  below’  by  dimension  in  increasing  order  except 
A/7,  which  is  different  from  the  others  and  will  be  discussed  later. 
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We  can  determine  the  kinetic  equations  of  the  exactly  lumped  systems  using 
Equation  28 


dy/di  =  il/f(A/y). 


(28) 


For  My  we  arbitrarily  choose 


Mi  =  (  1  0  ...  0)T. 


Since  y  =  A/y,  we  have 


yi  =  y- 


Vx  =  0-  (t  =2,3 . 8) 


# 


Substituting  these  relations  into  Equation  28  and  74.  the  lumped  kinetic  equation 


dy  di  =0.  (y  =  V  yt) 

t=  l 

For  A/2  we  arbitrarily  choose 


(\  0  0  0  0  0  0  o\T 
\0  0001000/ 


Similarly  we  have 


yi  =  y\  •  ys  =  y2 , 


y,  =0.  (t  =  2, 3, 4, 6, 7, 8) 


(75) 


Substituting  them  into  Equation  28  and  74,  the  lumped  kinetic  equations  are 


dy\  /dt  =  -2yj , 
dy2  /'dt  -  2y, . 


(76) 


Th  is  kinetic  equations  can  be  described  by  a  unimolecular  reaction  scheme: 
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C 


Ax  -  Ax  (i  -  1,2,  3,4),  Ax  -  Al+i  ( *  -  6,7),  -45  -  -45  4  .46 


ii  -  Ai  (*  =  1, 2, 3, 4, 5, 6),  i-  =  -4 7  4  -48 . 


The  last  lumping  scheme  M-,  can  only  be  described  by  a  simplified  set  of  differ' 
ential  equations,  since  it  does  not  follow  a  uni-  and/or  bimolecular  reaction  scheme. 
The  condition  under  which  a  lumped  system  follows  a  physical  uni-  and/or  bimolec¬ 
ular  reaction  scheme  has  been  discussed  in  References  [2]  and  [6].  The  differential 
equations  for  the  lumped  system  of  M-,  is  given  as 
dyi/dt  =  -2yi  -  2y,y2  4  4y3y4 
dy7/dt  =  -2 y2  ~  2yjy2  4  4y3y4 
dy3/dt  =  —2 y3  -  4y 3y4  4  2 y}y2 

(77) 

dy4/dt  =  -2y4  -  4y3y4  4  2y:y2 

dy5/dt  =  -yj  +  2y2  4  2v/2y3  -  (1  4  \/2)y5 

dyefdi  =  -\/2y,  4  2y4  -  (1  4  \Z2)ye 

where 


y.  =  y«,  (t  =  1.2. 3, 4) 
t/5  =  -ys  +  v  2yc . 
y6  =  -y/2yi  4  y8 . 
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This  lack  of  a  corresponding  chemical  mechanism  for  the  lumped  system  may 
often  arise  in  lumping,  and  there  is  no  practical  difficulty  in  this  situation.  Since 
the  lumping  scheme  above  is  exact,  the  simplified  model  will  give  exactly  the  same 
values  of  the  lumped  species  as  those  given  by  the  original  one.  Notice  that  y ,  =  y , 
for  i  =  1,2, 3, 4.  If  one  only  considers  yj  to  y4.  Equation  77  will  give  the  same  results 
as  those  given  by  Equation  74.  However,  Equation  77  is  simpler,  even  though  it 
does  not  follow  uni-  and/or  bimolecular  reaction  schemes. 

These  examples  are  very  simple,  however,  they  illustrate  the  method  which  can 
be  applied  to  other  more  complicated  systems. 

V.  CONCLUSION  AND  DISCUSSION 

In  this  paper  a  genaral  analysis  of  exact  lumping  has  been  given,  which  can  be 
used  for  any  system  described  by  a  set  of  first  order  ordinary  differential  equations 
with  any  degree  of  nonlinearity.  Uni-  and/or  bimolecular  reaction  systems  are  only- 
special  cases  of  this  general  analysis. 

The  kinetic  properties  and  the  coupling  pattern  of  the  reactions  in  the  exactly 
lumpable  system  must  satisfy  some  restrictions,  which  are  reflected  by  the  properties 
of  their  Jacobian  matrix  and  the  corresponding  Green’s  function.  The  intersections 
of  the  invariant  and  the  null  subspaces  of  the  Jacobian  matrix  represent  possible 
lumpabilities  of  a  given  complex  reaction  system.  A  systematic  method  to  determine 
the  intersections  of  the  invariant  and  the  null  subspaces  of  the  Jacobian  matrix  and 
the  corresponding  lumping  schemes  was  developed.  Using  the  generalized  inverse  of 
the  lumping  matrix,  the  differntial  equations  of  the  lumped  system  can  be  readily 
obtained,  and  the  non-unique  nature  of  the  generalized  inverses  does  not  effect  the 
form  of  the  lumped  equations  in  the  exact  case. 
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Although  some  useful  results  about  exact  lumping  have  been  obtained,  there 
is  still  further  work  to  do.  Systematic  application  of  this  analysis  to  complex  reac¬ 
tion  systems  needs  to  be  considered.  However,  in  the  treatment  of  actual  reaction 
systems,  the  first  problem  encountered  will  likely  be  their  non-exact  lumpability. 
Very  few  systems  satisfy  the  restrictions  for  the  existance  of  exact  lumping.  Some¬ 
times,  even  if  a  system  is  exactly  lumpable,  the  results  may  not  meet  practically 
desired  goals.  For  example,  in  the  C0/H20/02  combustion  system  we  would  like 
the  easily  measurable  concentrations  of  CO,  C02,  02,  H20  to  be  unlumped13.  With 
this  constraint ,  the  system  likely  can  not  be  exactly  lumped,  and  we  have  to  lump 
the  other  species  of  the  system  approximately.  Developing  a  general  approach  for 
approximate  lumping  is  very  important  for  realistic  problems.  The  exact  lumping 
analysis  presented  above  should  form  a  rigorous  starting  point  for  the  development 
of  approximate  lumping. 

NOTATION 

Scalars 

ajj(y)  =  fcth  coefficient  of  a  linear  combination  of  matrices 
.4,  =  tth  species  of  a  reaction  system 

c  =  constant 

d  =  constant 

Jij(y)  =  (t,j)-entry  of  matrix  J(y) 

m  —  dimension  of  matrix  Q(y) 
m,  =  mass  of  species  .4, 
m, j  =  (t,j)-entry  of  matrix  M 

n  =  dimension  of  vector  y 
=  dimension  of  vector  y 


n 


p  =  dimension  of  identity  matrix 

t  =  time 


t 


■jnuoLM 


y*  =  kih  element  of  vector  y 


Vectors  and  Matrices 


Capital  letters  represent  matrices;  bold-face  lower  case  letters  represent  vectors. 


=  constant  matrix 


—  constant  matrix 


=  unit  vector  with  1  as  its  ith  element,  and  0  for  the  rest  of  the  elements 


=  elementary  matrix  with  1  as  its  (t,j)-entry,  and  0  for  the  rest  of  the  elements 


=  n-dimensional  function  vector 


=  n-dimensional  function  vector 


G(t,r )  =  Green’s  function  of  J( y) 


G{1,t)  —  Green’s  function  of  j{ y) 


=  identitv  matrix 


=  Jacobian  matrix  of  f(y) 


=  submatrix  of  J(y) 


=  Jacobian  matrix  of  f(y) 


=  submatrix  of  J( y) 


=  rate  constant  matrix 


K(t,r)  =  defined  as  AIG(t,T )  —  G(t,r)M 


—  lumping  matrix 


—  submatrix  of  M 


=  n-dimensional  vector  which  ith  element  m,  is  the  mass  of  species  .4*  or  the 


row  vector  of  lumping  matrix  M 


=  generalized  inverse  of  M  satisfying  MM  — 


P  —  (n  —  n)  x  (n  —  n)  constant  matrix 

P( y)  —  (n  —  n)  x  [n  —  n)  function  matrix 

Q(y)  =  m  x  m  function  matrix 

5  =  (n  —  n)  x  m  constant  matrix 

5  =  generalized  inverse  of  5  satisfying  SS  =  Im 

VT  =  intersection  of  Z  and  A' 

X  =  n-dimensional  vector 

x  =  image  of  x  upon  mapping  by  (/n  -  MM) 

x  =  image  of  x  upon  mapping  by  J{ y) 

A'  =  n  x  (n  —  n)  constant  matrix  or  intersection  of  the  invariant  subspaces  of  the 
Jacobian  matrix  J( y) 

A  =  generalized  inverse  of  A”  satisfying  XX  =  in-a 

A'j  =  intersection  of  the  invariant  subspaces  of  the  Jacobian  matrix  J( y) 

A\  =  invariant  subspace  of  J(y^) 

A  >1^  =  eigenvector  matrix  of  .4* 

y  =  n-dimensional  variable  vector 

y  =  n-dimensional  variable  vector 

Y  —  subspace  of  n-dimensional  space,  which  contains  A 
Z  =  n  x  m  constant  matrix  or  subspace  of  A”,  which  is  invariant  to  J( y) 

Z{  =  intersection  between  Z  and  A, 

Greek  Letters 

Aj  =  ith  eigenvalue  of  matrix  Ak  or  K 

A  =  diagonal  eigenvalue  matrix  of  matrix  K  with  A,  as  its  diagonal  elements 
Symbols 

=  any  property  related  to  the  lumped  system 
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■a 


0  =  null  vector 
0  =  null  matrix 


* 


# 


m 


c 
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